Is it possible to use the AngelList API to get the entire list of companies (startups) inside the AngelList website without knowing the ID of all of them?
Or, is there a way to get all of the company IDs?
I'm currently feeding guessed URLs to a JSON parser, because AngelList tag IDs seem to be assigned arbitrarily to locations, markets, and other categories.
I would like to save all the AngelList startups and companies to a text file.
for (int h = 1612; h<=1885; h++){
do {
// change the URL as per the requirement and also paginating it
String libURL = "https://api.angel.co/1/tags/" + h + "/startups?page="
+ i;
InputStream in = URI.create(libURL).toURL().openStream();
// writing each page into a separate file
FileOutputStream fout = new FileOutputStream(
"/Users/Fabio/Desktop/FilesAngellist/file" + i + ".txt");
byte data[] = new byte[1024];
int count;
while ((count = in.read(data, 0, 1024)) != -1) {
fout.write(data, 0, count);
}// end of while
if (i == 1) {
DownloadJobJson db = new DownloadJobJson(); // code to pull
// "last_page" value
// from json file
pagenumber = db.DownloadJobJson1();
}// end of if
i = i + 1;
} while (i <= pagenumber);// end of do-while()
}
This is the code of the JSON downloader for the AngelList URLs.
Solved it using the linkedin-j project.
I have a project that splits a PDF file uploaded by a user, reads the content of each page, and then merges pages back together based on that content. I use PDDocument for loading and splitting and PDFMergerUtility for merging; after merging, I save the merged PDF to the database as a byte array.
After it is saved to the DB, the user can also download the PDF that was split and merged by content, and re-upload it when needed.
However, I have found a problem: after merging, the PDF is larger than the original PDF was before splitting.
I have tried to find a solution, but nothing I found works for my problem, such as:
Android PdfDocument file size
Is there a way to compress PDF to small size using Java?
and other similar solutions.
Is there any way to solve my problem?
I would be glad for any help.
Here is my code:
//file: MultipartFile -> file is send from front-end using API
val inpStream: InputStream = file.getInputStream()
pdfDocument = PDDocument.load(inpStream)
// splitting the pages of a PDF document
pagesPdf = splitter.split(pdfDocument)
val n = pdfDocument.numberOfPages
val batchSize:Int = 200
val finalBatchSize: Int = n % batchSize
val numOfBatch: Int = (n - finalBatchSize) / batchSize
val batchFinal: Int = if (finalBatchSize == 0) numOfBatch else (numOfBatch + 1)
var batchNo: Int = 1
var startPage: Int
var endPage: Int = 0
while (batchNo <= batchFinal) {
startPage = endPage + 1
if (batchNo > numOfBatch) {
endPage = endPage + finalBatchSize
} else {
endPage = endPage + batchSize
}
val splitter:Splitter = Splitter()
splitter.setStartPage(startPage)
splitter.setEndPage(endPage)
// splitting the pages of a PDF document
pagesPdf = splitter.split(pdfDocument)
batchNo++
i = startPage
var groupPage: Int = i
var pageNo = 0
var pdfMerger: PDFMergerUtility = PDFMergerUtility()
var mergedFileByteArrOut: ByteArrayOutputStream = ByteArrayOutputStream()
pdfMerger.setDestinationStream(mergedFileByteArrOut)
var fileObj: ByteArray? = null
for (pd in pagesPdf) {
pageNo++;
if (!pd.isEncrypted) {
val stripper = PDFTextStripper()
// code to get the page content
if(condition1 == true){
var fileByteArrOut: ByteArrayOutputStream = ByteArrayOutputStream()
pd.save(fileByteArrOut)
pd.close()
var fileByteArrIn: ByteArrayInputStream = ByteArrayInputStream(fileByteArrOut.toByteArray())
pdfMerger.addSource(fileByteArrIn)
fileObj = fileByteArrOut.toByteArray()
}
if(condition2 == true){
//I want to compress fileObj first before save to DB
//code to save to DB
fileObj = null
pdfMerger = PDFMergerUtility()
mergedFileByteArrOut= ByteArrayOutputStream()
pdfMerger.setDestinationStream(mergedFileByteArrOut)
}
}
}
} // end of while (batchNo <= batchFinal)
You can use cpdf (https://community.coherentpdf.com) to losslessly squeeze the PDF files afterward. This merges identical objects and shared parts, and removes anything that is not needed.
From the command line:
cpdf -squeeze in.pdf -o out.pdf
Or, from Java:
jcpdf.squeezeInMemory(pdf);
I am having trouble finding a way, in Java, to plot location points (lat, long) from a large CSV file on Google Maps.
I was able to read the location data from the big dataset, but I am having trouble placing the points on Google Maps in a streaming fashion.
I am not an expert in programming, but I started with the code below:
JFrame test = new JFrame("Google Maps");
try {
String latitude = "45.714728";
String longitude = "-73.998672";
String imageUrl = "https://maps.googleapis.com/maps/api/staticmap?center="+ latitude+ ","+ longitude+ "&zoom=11&size=612x612&scale=2&maptype=roadmap";
String destinationFile = "image.jpg";
// read the map image from Google
// then save it to a local file: image.jpg
URL url = new URL(imageUrl);
InputStream is = url.openStream();
OutputStream os = new FileOutputStream(destinationFile);
byte[] b = new byte[2048];
int length;
while ((length = is.read(b)) != -1) {
os.write(b, 0, length);
}
is.close();
os.close();
} catch (IOException e) {
e.printStackTrace();
System.exit(1);
}
// create a GUI component that loads the image: image.jpg
ImageIcon imageIcon = new ImageIcon((new ImageIcon("image.jpg"))
.getImage().getScaledInstance(630, 600,
java.awt.Image.SCALE_SMOOTH));
test.add(new JLabel(imageIcon));
// size the window to fit the image, then show it
test.pack();
test.setVisible(true);
Any help is appreciated.
Thanks in advance.
Since the dataset is huge, as you mentioned, you shouldn't go for a URL-based solution. A URL has a length limit, so beyond a certain number of points this approach won't work. You should look for an API that plots the locations for you. Check this.
If the number of points you want to plot is relatively small, you can use the following approach.
String center = centerLat + "," + centerLong;
String points[] = {lat1+","+long1, lat2+","+long2};//points from csv
String plottedPoints = new String();
for(String point: points) {
plottedPoints = plottedPoints + point + "|";
}
//Finally construct the url
String imageUrl = "http://maps.google.com/maps/api/staticmap?center=" + center + "&size=512x512&maptype=roadmap&sensor=false&markers="+plottedPoints;
And then read the resulting image as you have already done.
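As a variation on the snippet above, here is a minimal sketch that joins the points without a trailing "|" and URL-encodes the parameter values. The class and method names (StaticMapUrl, markersParam, build) are made up for illustration, the zoom and size values are arbitrary, and the sketch assumes Java 10 or later for URLEncoder.encode(String, Charset). Depending on your account, the Static Maps API may also require a key parameter, which is not shown here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of any Google library.
final class StaticMapUrl {

    // Joins "lat,long" strings with '|' and URL-encodes the result,
    // so ',' becomes %2C and '|' becomes %7C.
    static String markersParam(List<String> points) {
        String joined = String.join("|", points);
        return "markers=" + URLEncoder.encode(joined, StandardCharsets.UTF_8);
    }

    // Builds the full static map URL for a center point and a list of markers.
    static String build(String center, List<String> points) {
        return "https://maps.googleapis.com/maps/api/staticmap?center="
                + URLEncoder.encode(center, StandardCharsets.UTF_8)
                + "&zoom=11&size=512x512&maptype=roadmap&"
                + markersParam(points);
    }
}
```

Calling StaticMapUrl.build("45.714728,-73.998672", points) with the points read from the CSV gives a URL you can download exactly as in the question. The URL length limit still applies, so this only suits a small number of points.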
I have uploaded an .mp4 video file (18 MB) into GridFS and am trying to read it from Java code. Here are the points where I am unable to move further:
1) I am able to retrieve the whole video into a byte array and play it.
2) I am also able to play the first N bytes, that is, from the first chunk up to the nth chunk, by querying fs.chunks directly as below and writing the result to a ServletOutputStream.
DBCollection a = db.getCollection("fs.chunks");
DBCursor cur1 = a.find().limit(10);
System.out.println(cur1);
byte[] destination2 =new byte[2621440];
int length2 = 0;
while(cur1.hasNext()) {
byte[] b2 = (byte[]) cur1.next().get("data");
System.arraycopy(b2, 0, destination2, length2, b2.length);
length2 += b2.length;
System.out.println("##########");
System.out.println(destination2.length);
}
3) I am stuck here: when reading from the middle of the chunks, that is, after skipping n chunks in the find() operation, Windows Media Player cannot play the video and reports a codec error, among others. Am I going about this the right way?
DBCollection a= db.getCollection("fs.chunks");
DBCursor cur1=a.find(new BasicDBObject("n",new BasicDBObject("$gt",9))).limit(10);
System.out.println(cur1);
byte[] destination2 =new byte[2621440];
int length2 = 0;
while(cur1.hasNext()) {
byte[] b2 = (byte[]) cur1.next().get("data");
System.arraycopy(b2, 0, destination2, length2, b2.length);
length2 += b2.length;
System.out.println("##########");
System.out.println(destination2.length);
}
...........
public void showVideos(Model model,HttpServletResponse response) throws IOException {............response.setHeader("Content-Type", "video/quicktime");
response.setHeader("Content-Disposition", "inline; filename=\"" + filename + "\"");//byte[] bytearray =destination2
//response.s
ServletOutputStream out = response.getOutputStream();
System.out.println("hello");
int n=0;
//while(is.read(bytes, 0, 4096) != -1)
{
System.out.println(n++);
out.write(bytearray);
}
Please suggest how to retrieve part of the video file from GridFS and play it.
I'd use the GridFS classes for this purpose. Pseudo code below. myFS points to the bucket and findOne looks for the id of the file.
GridFS myFS = null;
if (bucket.isPresent()) {
myFS = new GridFS(m.getDb(), bucket.get());
} else {
myFS = new GridFS(m.getDb());
}
return Optional.fromNullable(myFS.findOne(id));
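On the partial-read part of the question: a slice taken from the middle of an MP4 file usually lacks the container header, which is a likely reason the player reports a codec error. If the goal is to serve a byte range (for example, in response to an HTTP Range request), you first need to know which chunks cover that range. Below is a small sketch of that arithmetic only. ChunkRange is a hypothetical helper, not a driver class, and chunkSize is assumed to come from the chunkSize field of the file's fs.files document.

```java
// Hypothetical helper: maps a byte range of the stored file to GridFS chunk numbers.
final class ChunkRange {
    final long firstChunk;        // value of "n" for the first chunk to read
    final long lastChunk;         // value of "n" for the last chunk to read
    final int offsetInFirstChunk; // bytes to skip at the start of the first chunk

    private ChunkRange(long firstChunk, long lastChunk, int offsetInFirstChunk) {
        this.firstChunk = firstChunk;
        this.lastChunk = lastChunk;
        this.offsetInFirstChunk = offsetInFirstChunk;
    }

    // start and endInclusive are zero-based byte positions within the file.
    static ChunkRange forBytes(long start, long endInclusive, int chunkSize) {
        if (start < 0 || endInclusive < start || chunkSize <= 0) {
            throw new IllegalArgumentException("invalid byte range or chunk size");
        }
        return new ChunkRange(start / chunkSize,
                              endInclusive / chunkSize,
                              (int) (start % chunkSize));
    }
}
```

With a chunk size of 262144 bytes, the byte range 262145 to 524288 maps to chunks 1 through 2, starting 1 byte into chunk 1. You would then query fs.chunks for that range of n, sorted by n, and trim the first and last chunk accordingly.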
I have some data that I have saved into a file using Matlab. I have saved this data in Matlab as follows:
fwrite(fid,numImg2,'integer*4');
fwrite(fid,y,'integer*4');
fwrite(fid,imgName,'char*1');
fwrite(fid,a,'integer*4');
fwrite(fid,img.imageData,'double');
I read this data back into Matlab using the following code
fread(fid,1,'integer*4'); % returns numImg2
fread(fid,1,'integer*4'); % returns y, the number of characters in the image name; I use it in the next line to read the name (for example, 5 characters for 1.jpg)
fread(fid,5,'char*1');
fread(fid,1);
etc...
I want to be able to read this data on an Android phone. This is the code I have at the moment.
DataInputStream ds = new DataInputStream(new FileInputStream(imageFile));
//String line;
// Read the first byte to find out how many images are stored in the file.
int x = 0;
byte numberOfImages;
int numImages = 0;
while(x<1)
{
numberOfImages = ds.readByte();
numImages = (int)numberOfImages;
x++;
}
int lengthName = 0;
String imgName = "";
for(int y=1; y<=numImages; y++)
{
lengthName = ds.readInt();
byte[] nameBuffer = new byte[lengthName];
char[] name = new char[lengthName];
for(int z = 1; z<=5;z++)
{
nameBuffer[z-1] = ds.readByte();
//name[z-1] = ds.readChar();
}
imgName = new String(nameBuffer);
//imgName = name.toString();
}
text.append(imgName);
I cannot seem to retrieve the image name as a string from the binary file data. Any help is much appreciated.
I'm not sure it will work but anyway:
byte[] nameBuffer = new byte[lengthName];
if(ds.read(nameBuffer) != lengthName) {
// error handling here
return;
}
imgName = new String(nameBuffer, "ISO-8859-1");
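One more thing worth checking is byte order. DataInputStream.readInt() always reads big-endian, while Matlab's fwrite uses the machine's native order unless told otherwise, which is little-endian on typical PCs. Also note that the question reads the image count with readByte() although it was written as a 4-byte integer. Here is a minimal sketch under those assumptions (little-endian file, layout exactly as in the fwrite calls above); MatlabRecordReader and firstImageName are names I made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes the start of the file written by the fwrite calls.
final class MatlabRecordReader {

    // fileBytes holds the raw file content; the byte order is ASSUMED little-endian.
    static String firstImageName(byte[] fileBytes) {
        ByteBuffer buf = ByteBuffer.wrap(fileBytes).order(ByteOrder.LITTLE_ENDIAN);
        int numImages = buf.getInt();   // numImg2, written as 'integer*4' (unused here)
        int nameLength = buf.getInt();  // y, the number of characters in the name
        byte[] name = new byte[nameLength];
        buf.get(name);                  // imgName, written as 'char*1'
        return new String(name, StandardCharsets.ISO_8859_1);
    }
}
```

If you prefer to keep the stream-based code, the same idea applies: read 4 bytes with readFully, wrap them in a ByteBuffer with LITTLE_ENDIAN order, and call getInt().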
I'm trying to write a simple RTF document pretty much from scratch in Java, and I'm trying to embed JPEGs in the document. Here's an example of a JPEG (a 2x2-pixel JPEG consisting of three white pixels and a black pixel in the upper left, if you're curious) embedded in an RTF document (generated by WordPad, which converted the JPEG to WMF):
{\pict\wmetafile8\picw53\pich53\picwgoal30\pichgoal30
0100090000036e00000000004500000000000400000003010800050000000b0200000000050000
000c0202000200030000001e000400000007010400040000000701040045000000410b2000cc00
020002000000000002000200000000002800000002000000020000000100040000000000000000
000000000000000000000000000000000000000000ffffff00fefefe0000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000
0000001202af0801010000040000002701ffff030000000000
}
I've been reading the RTF specification, and it looks like you can specify that the image is a JPEG, but since WordPad always converts images to WMF, I can't see an example of an embedded JPEG. So I may also end up needing to transcode from JPEG to WMF or something....
But basically, I'm looking for how to generate the binary or hexadecimal (Spec, p.148: "These pictures can be in hexadecimal (the default) or binary format.") form of a JPEG given a file URL.
Thanks!
EDIT: I have the stream stuff working all right, I think, but still don't understand exactly how to encode it, because whatever I'm doing, it's not RTF-readable. E.g., the above picture instead comes out as:
ffd8ffe00104a464946011106006000ffdb0430211211222222223533333644357677767789b988a877adaabcccc79efdcebcccffdb04312223336336c878ccccccccccccccccccccccccccccccccccccccccccccccccccffc0011802023122021113111ffc401f001511111100000000123456789abffc40b5100213324355440017d123041151221314161351617227114328191a182342b1c11552d1f024336272829a161718191a25262728292a3435363738393a434445464748494a535455565758595a636465666768696a737475767778797a838485868788898a92939495969798999aa2a3a4a5a6a7a8a9aab2b3b4b5b6b7b8b9bac2c3c4c5c6c7c8c9cad2d3d4d5d6d7d8d9dae1e2e3e4e5e6e7e8e9eaf1f2f3f4f5f6f7f8f9faffc401f103111111111000000123456789abffc40b51102124434754401277012311452131612415176171132232818144291a1b1c19233352f0156272d1a162434e125f11718191a262728292a35363738393a434445464748494a535455565758595a636465666768696a737475767778797a82838485868788898a92939495969798999aa2a3a4a5a6a7a8a9aab2b3b4b5b6b7b8b9bac2c3c4c5c6c7c8c9cad2d3d4d5d6d7d8d9dae2e3e4e5e6e7e8e9eaf2f3f4f5f6f7f8f9faffda0c31021131103f0fdecf09f84f4af178574cd0b42d334fd1744d16d22bd3f4fb0b74b6b5bb78902450c512091c688aaaa8a0500014514507ffd9
This PHP library would do the trick, so I'm trying to port the relevant portion to Java. Here it is:
$imageData = file_get_contents($this->_file);
$size = filesize($this->_file);
$hexString = '';
for ($i = 0; $i < $size; $i++) {
$hex = dechex(ord($imageData{$i}));
if (strlen($hex) == 1) {
$hex = '0' . $hex;
}
$hexString .= $hex;
}
return $hexString;
But I don't know what the Java analogue to dechex(ord($imageData{$i})) is. :( I got only as far as the Integer.toHexString() function, which takes care of the dechex part....
Thanks all. :)
Given a file URL for any file you can get the corresponding bytes by doing (exception handling omitted for brevity)...
int BUF_SIZE = 512;
URL fileURL = new URL("http://www.somewhere.com/someurl.jpg");
InputStream inputStream = fileURL.openStream();
byte [] smallBuffer = new byte[BUF_SIZE];
ByteArrayOutputStream largeBuffer = new ByteArrayOutputStream();
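int numRead;
// read() returns -1 at end of stream and may return fewer than BUF_SIZE bytes before that
while ((numRead = inputStream.read(smallBuffer, 0, BUF_SIZE)) != -1) {
// write only the bytes that were actually read
largeBuffer.write(smallBuffer, 0, numRead);
}
inputStream.close();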
byte [] bytes = largeBuffer.toByteArray();
I'm looking at your PHP snippet now and realizing that RTF is a bizarre specification! It looks like each byte of the image is encoded as 2 hex digits (which doubles the size of the image for no apparent reason). Then the entire thing is stored as raw ASCII. So, you'll want to do...
StringBuilder hexStringBuilder = new StringBuilder(bytes.length * 2);
for(byte imageByte : bytes) {
String hexByteString = Integer.toHexString(0x000000FF & (int)imageByte);
if(hexByteString.length() == 1) {
hexByteString = "0" + hexByteString;
}
hexStringBuilder.append(hexByteString);
}
String hexString = hexStringBuilder.toString();
byte [] hexBytes = hexString.getBytes("UTF-8"); //Could also use US-ASCII
EDIT: Updated code sample to pad 0's on the hex bytes
EDIT: negative bytes were getting logically right shifted when converted to ints >_<
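For completeness, here is a rough sketch of how the hex string could be wrapped into a \pict group with the \jpegblip control word from the RTF specification, so the JPEG is embedded as-is instead of being converted to WMF. The output shown in the question appears to have lost the leading zeros of single-digit bytes, which is what the two-digit padding addresses. RtfJpeg and its method names are made up for illustration, and the 15 twips per pixel used for \picwgoal and \pichgoal assumes a 96 dpi display, matching the WordPad sample (2 pixels gives goal 30).

```java
// Hypothetical helper for embedding a JPEG in RTF; not from any library.
final class RtfJpeg {

    // Encodes every byte as exactly two lowercase hex digits.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // mask to 0..255 so negative bytes do not expand to 8 hex digits
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Builds a \pict group; 15 twips per pixel is an ASSUMPTION (96 dpi).
    static String pictGroup(byte[] jpeg, int widthPx, int heightPx) {
        return "{\\pict\\jpegblip"
                + "\\picw" + widthPx + "\\pich" + heightPx
                + "\\picwgoal" + (widthPx * 15) + "\\pichgoal" + (heightPx * 15)
                + " " + toHex(jpeg) + "}";
    }
}
```

The returned string can be written straight into the RTF body; long hex runs may be broken with newlines for readability.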
https://joseluisbz.wordpress.com/2013/07/26/exploring-a-wmf-file-0x000900/
Maybe this will help you:
String HexRTFBytes = "Representations text of bytes from Image RTF File";
String Destiny = "The path of the output File";
FileOutputStream wmf;
try {
wmf = new FileOutputStream(Destiny);
HexRTFBytes = HexRTFBytes.replaceAll("\n", ""); //Erase New Lines
HexRTFBytes = HexRTFBytes.replaceAll(" ", ""); //Erase Blank spaces
int NumBytesWrite = HexRTFBytes.length();
int WMFBytes = NumBytesWrite/2;//One byte is represented by 2 characters
byte[] ByteWrite = new byte[WMFBytes];
for (int i = 0; i < WMFBytes; i++){
String se = HexRTFBytes.substring(i*2,i*2+2);
int Entero = Integer.parseInt(se,16);
ByteWrite[i] = (byte)Entero;
}
wmf.write(ByteWrite);
wmf.close();
}
catch (FileNotFoundException fnfe)
{System.out.println(fnfe.toString());}
catch (NumberFormatException fnfe)
{System.out.println(fnfe.toString());}
catch (EOFException eofe)
{System.out.println(eofe.toString());}
catch (IOException ioe)
{System.out.println(ioe.toString());}
This code takes the hexadecimal representation as a single string and stores the decoded bytes in a file.
https://joseluisbz.wordpress.com/2011/06/22/script-de-clases-rtf-para-jsp-y-php/
Now if you want to obtain the representation of the image file, you can use this:
private void ByteStreamImageString(byte[] ByteStream) {
this.Format = 0;
this.High = 0;
this.Wide = 0;
this.HexImageString = "Error";
if (ByteStream[0]== (byte)137 && ByteStream[1]== (byte)80 && ByteStream[2]== (byte)78){
this.Format = PNG; //PNG
this.High = this.Byte2PosInt(ByteStream[22],ByteStream[23]);
this.Wide = this.Byte2PosInt(ByteStream[18],ByteStream[19]);
}
if (ByteStream[0]== (byte)255 && ByteStream[1]== (byte)216
&& ByteStream[2]== (byte)255 && ByteStream[3]== (byte)224){
this.Format = JPG; //JPG
int PosJPG = 2;
while (PosJPG < ByteStream.length){
String M = String.format("%02X%02X", ByteStream[PosJPG+0],ByteStream[PosJPG+1]);
if (M.equals("FFC0") || M.equals("FFC1") || M.equals("FFC2") || M.equals("FFC3")){
this.High = this.Byte2PosInt(ByteStream[PosJPG+5],ByteStream[PosJPG+6]);
this.Wide = this.Byte2PosInt(ByteStream[PosJPG+7],ByteStream[PosJPG+8]);
}
if (M.equals("FFDA")) {
break;
}
PosJPG = PosJPG+2+this.Byte2PosInt(ByteStream[PosJPG+2],ByteStream[PosJPG+3]);
}
}
if (this.Format > 0) {
this.HexImageString = "";
int Salto = 0;
for (int i=0;i < ByteStream.length; i++){
Salto++;
this.HexImageString += String.format("%02x", ByteStream[i]);
if (Salto==64){
this.HexImageString += "\n"; //To make readable
Salto = 0;
}
}
}
}
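The method above calls Byte2PosInt, which is not shown in the snippet. Judging from how it is used (image dimensions and JPEG segment lengths, both stored big-endian), it presumably combines two bytes into a non-negative int. The following is only a guess at such a helper, not the original implementation, and ByteUtil is a made-up class name.

```java
// Hypothetical stand-in for the Byte2PosInt method used above.
final class ByteUtil {

    // Combines a high and a low byte (big-endian) into a value in 0..65535.
    static int byte2PosInt(byte high, byte low) {
        // mask each byte so that negative values are treated as unsigned
        return ((high & 0xFF) << 8) | (low & 0xFF);
    }
}
```

For example, ByteUtil.byte2PosInt((byte) 0x01, (byte) 0x00) gives 256.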