How do we create the partitions in hive with the Spaces? - java

We want to create partitions in a Hive table, but the partition name contains a space, so the partition cannot be created. We are currently using Java.
We tried to escape the space, but every attempt throws an exception.
URL -s3n://comp-data-bckp/data/datav/sample_test/sample_test_inner/2016-06-27/hive/warehouse/datav/sample_test_inner/platform=SONY PS3
HiveConf hiveConf= new HiveConf();
hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, URL);
hiveConf.setIntVar(HiveConf.ConfVars.METASTORE_CLIENT_SOCKET_TIMEOUT, 60);
CliSessionState css = new CliSessionState(hiveConf);
css.in = System.in;
css.out = null;
css.err = null;
css.setIsSilent(true);
SessionState.start(css);
CliDriver cli = new CliDriver();
int response = cli.processLine(statements);
SessionState s = SessionState.get();
if (s != null && s.out != null && s.out != System.out)
{
s.out.close();
}
return response;
java.net.URISyntaxException: Illegal character in path at index 126: s3n://comp-data-bckp/data/datav/sample_test/sample_test_inner/2016-06-27/hive/warehouse/datav/sample_test_inner/platform=SONY PS3
When we try to escape the space character with the options below, the system still throws the above error.
- \\ (eg. S3n://…./ platform=SONY\\ PS3)
- %20 (eg. S3n://…./ platform=SONY%20PS3)
- + (eg. S3n://…./ platform=SONY+PS3)
Is there any way to escape the space so that the partition can be created?
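The exception above is raised by java.net.URI, which rejects a raw space in a path. The following is a minimal sketch of that behavior in plain Java, not a confirmed fix for the Hive case (the question reports that %20 was still rejected somewhere in its stack). The class name PartitionUriDemo and the shortened path are illustrative assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PartitionUriDemo {
    // Returns true if the single-argument URI constructor accepts the string as-is.
    static boolean parses(String raw) {
        try {
            new URI(raw);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds a URI from its components; this constructor quotes illegal
    // characters such as spaces instead of rejecting them.
    static String encode(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened, hypothetical path; the real location is much longer.
        String path = "/warehouse/sample_test_inner/platform=SONY PS3";
        System.out.println(parses("s3n://comp-data-bckp" + path));
        System.out.println(encode("s3n", "comp-data-bckp", path));
    }
}
```

If the component-based constructor is used wherever the location string is turned into a URI, the space never reaches the parser unescaped. Whether Hive then maps the encoded path back to the intended partition value is something to verify separately.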

Related

ComboAnalyzer - AttributeImpl not found in AttributeSource

The analyzer OpenNLPAnalyzer, based on the OpenNLPTokenizer in the opennlp package that ships with Lucene and described in this blog post, works as promised. I am now trying to use it inside a ComboAnalyzer (part of an ES plugin that combines multiple analyzers; see link below) in the following way:
ComboAnalyzer analyzer = new ComboAnalyzer(new EnglishAnalyzer(), new OpenNLPAnalyzer());
TokenStream stream = analyzer.tokenStream("fieldname", new StringReader(text));
stream is a ComboTokenStream. On calling stream.incrementToken(), I get the following exception at line 105 here:
Exception in thread "main": State contains AttributeImpl of type org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl that is not in in this AttributeSource
Here is what the called method restoreState does.
public final void restoreState(State state) {
if (state == null) return;
do {
AttributeImpl targetImpl = attributeImpls.get(state.attribute.getClass());
if (targetImpl == null) {
throw new IllegalArgumentException("State contains AttributeImpl of type " +
state.attribute.getClass().getName() + " that is not in in this AttributeSource");
}
state.attribute.copyTo(targetImpl);
state = state.next;
} while (state != null);
}
This hints that one of the TokenStreams has an OffsetAttribute but the other does not. Is there a clean way to fix this?
I tried to add the line addAttribute(OffsetAttribute.class) in the same file here. I still get the same exception.
The problem was here:
Tokenizer source = new OpenNLPTokenizer(
AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY, sentenceDetectorOp, tokenizerOp);
The fix is to pass in TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY instead of AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY. The former uses PackedTokenAttributeImpl for implementing OffsetAttribute (and many other attributes) and the latter picks OffsetAttributeImpl.

java.lang.NullPointerException: lock == null from InputStreamReader

I'm trying to parse a Wavefront .obj file to display it with OpenGL ES, but I'm getting a NullPointerException, as if the file did not exist or was empty.
I tried two different ways of parsing the file, made sure it contains no empty lines, and put it in different folders (assets, src root, res, etc.), but the result is the same. Maybe the error has more to do with the OpenGL part of the code? I'm a bit lost, because apparently it should work.
I also tried buffering the file outside the function, with the same result. In another question here, the problem turned out to be "trying to update UI from worker Thread". Async did not help me here.
I got the code idea from this blog: http://etcodehome.blogspot.com/2011/07/android-rendering-3d-blender-models.html
And the file to base my work on from here: https://github.com/MartianIsMe/earth-live-wallpaper/blob/d71902aa642bad0c10fc46d6839ced6e15995f7b/%20earth-live-wallpaper/SLWP/src/com/seb/SLWP/DeathStar.java
fun loadObjFile() {
try {
var str: String
var tmp: Array<String>
var ftmp: Array<String>
var v: Float
val vlist = ArrayList<Float>()
val nlist = ArrayList<Float>()
val fplist = ArrayList<Fp>()
val mContext: Context? = null
//val inb: BufferedReader = File("androidmodel.obj").bufferedReader()
//val inputString = inb.use { it.readText() }
val inb = BufferedReader(InputStreamReader(mContext?.getAssets()?.open
("src/main/res/androidmodel.obj")), 1024) //Error is here at com.example.xxx.MyGLRenderer.loadObjFile
while (inb.readLine().also { str = it } != null) {
tmp = str.split(" ".toRegex()).toTypedArray()
//Parse the vertices
if (tmp[0].equals("v", ignoreCase = true)) {
for (i in 1..3) {
v = tmp[i].toFloat()
vlist.add(v)
}
}
//Parse the vertex normals
if (tmp[0].equals("vn", ignoreCase = true)) {
for (i in 1..3) {
v = tmp[i].toFloat()
nlist.add(v)
}
}
//Parse the faces/indices
if (tmp[0].equals("f", ignoreCase = true)) {
for (i in 1..3) {
ftmp = tmp[i].split("/".toRegex()).toTypedArray()
val chi = ftmp[0].toInt() - 1.toLong()
var cht = 0
if (ftmp[1] != "") cht = ftmp[1].toInt() - 1
val chn = ftmp[2].toInt() - 1
fplist.add(Fp(chi, cht, chn))
}
NBFACES++
}
}
val vbb = ByteBuffer.allocateDirect(fplist.size * 4 * 3)
vbb.order(ByteOrder.nativeOrder())
mVertexBuffer = vbb.asFloatBuffer()
val nbb = ByteBuffer.allocateDirect(fplist.size * 4 * 3)
nbb.order(ByteOrder.nativeOrder())
mNormBuffer = nbb.asFloatBuffer()
for (j in fplist.indices) {
mVertexBuffer?.put(vlist[(fplist[j].Vi * 3).toInt()])
mVertexBuffer?.put(vlist[(fplist[j].Vi * 3 + 1).toInt()])
mVertexBuffer?.put(vlist[(fplist[j].Vi * 3 + 2).toInt()])
mNormBuffer?.put(nlist[fplist[j].Ni * 3])
mNormBuffer?.put(nlist[fplist[j].Ni * 3 + 1])
mNormBuffer?.put(nlist[fplist[j].Ni * 3 + 2])
}
mIndexBuffer = CharBuffer.allocate(fplist.size)
for (j in fplist.indices) {
mIndexBuffer?.put(j.toChar())
}
mVertexBuffer?.position(0)
mNormBuffer?.position(0)
mIndexBuffer?.position(0)
} catch (e: IOException) {
e.printStackTrace()
}
}
private class Fp
(var Vi: Long, var Ti: Int, var Ni: Int)
The problem is that you pass null into InputStreamReader. The path to the asset is also wrong.
First of all, the file should be located in the assets directory, which sits at the same level in the directory hierarchy as the java and res folders.
Second, you should pass a path relative to the assets directory. If your file is located directly under assets, the relative path is "androidmodel.obj". Creating the input stream then looks like this:
InputStreamReader(mContext?.getAssets()?.open("androidmodel.obj"))
But I strongly recommend checking for non-null, because if mContext is null the issue will return.
mContext?.getAssets()?.open("androidmodel.obj")?.let { nonNullAsset ->
InputStreamReader(nonNullAsset)
}
The ?.let { part is crucial, as it runs the let function only if the object is not null.
If there is no assets directory, just create it as a plain directory and it will be picked up by the IDE automatically.
Update
As the NPE still occurs, the only reason left is a null value in the mContext variable. Make sure it is initialized.
After a little more digging, I can say that this was the issue from the beginning. Passing a wrong file path to assets.open(fileName) results in a FileNotFoundException, not an NPE. So even though the path you use is wrong, you never reached the point of opening the file, because the context is null.
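The failure mode described above can be reproduced in plain Java without Android. This is a minimal sketch under the assumption that the stream comes from somewhere that may return null; the class name NullStreamDemo and the helper openReader are hypothetical, and the exact exception message differs between Android and desktop JDKs.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class NullStreamDemo {
    // Wrapping a null stream fails inside the reader's constructor,
    // before any file access is attempted.
    static boolean throwsNpeOnNullStream() {
        try {
            new InputStreamReader((InputStream) null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Hypothetical helper: fail early with a descriptive message instead of an NPE.
    static BufferedReader openReader(InputStream in) throws IOException {
        if (in == null) {
            throw new IOException("input stream is null; check that the context is initialized");
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8), 1024);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(throwsNpeOnNullStream());
        InputStream sample = new ByteArrayInputStream("v 1.0 2.0 3.0\n".getBytes(StandardCharsets.UTF_8));
        try (BufferedReader reader = openReader(sample)) {
            System.out.println(reader.readLine());
        }
    }
}
```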

Redisson: Not able to set address in SingleServer mode

I am using single server mode to configure the Redis server and port. Am I missing something here?
Config config = new Config();
config.useSingleServer().setAddress("localhost:6379");
But the following exception is thrown:
Exception in thread "main" java.lang.IllegalArgumentException: Illegal character in scheme name at index 0: [localhost]:6379
at java.net.URI.create(URI.java:852)
at org.redisson.misc.URIBuilder.create(URIBuilder.java:38)
at org.redisson.config.SingleServerConfig.setAddress(SingleServerConfig.java:129)
It seems the code below in org.redisson.misc.URIBuilder is what causes the issue:
public static URI create(String uri) {
URI u = URI.create(uri);
// Let's assuming most of the time it is OK.
if (u.getHost() != null) {
return u;
}
String s = uri.substring(0, uri.lastIndexOf(":")).replaceFirst("redis://", "").replaceFirst("rediss://", "");
// Assuming this is an IPv6 format, other situations will be handled by
// Netty at a later stage.
return URI.create(uri.replace(s, "[" + s + "]"));
}
I managed to get it working by adding the redis:// scheme to the address:
Config config = new Config();
config.useSingleServer().setAddress("redis://localhost:6379");
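The behavior can be reproduced with java.net.URI alone. This is a small sketch, not Redisson code: the bracketed method mirrors the fallback logic quoted above only to show which string it produces, and the class name RedisAddressDemo is an assumption.

```java
import java.net.URI;

public class RedisAddressDemo {
    // Host as seen by java.net.URI; null when the string has no "scheme://host" part.
    static String hostOf(String address) {
        return URI.create(address).getHost();
    }

    // Mirrors the bracket-wrapping fallback quoted above.
    static String bracketed(String address) {
        String s = address.substring(0, address.lastIndexOf(":"))
                .replaceFirst("redis://", "")
                .replaceFirst("rediss://", "");
        return address.replace(s, "[" + s + "]");
    }

    // True if URI.create accepts the string.
    static boolean isValidUri(String candidate) {
        try {
            URI.create(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Without a scheme, "localhost" is parsed as the scheme, so there is no host.
        System.out.println(hostOf("localhost:6379"));
        // The fallback then wraps it in brackets, which is not a valid URI.
        System.out.println(bracketed("localhost:6379"));
        System.out.println(isValidUri(bracketed("localhost:6379")));
        // With the scheme present, the host is found and no fallback is needed.
        System.out.println(hostOf("redis://localhost:6379"));
    }
}
```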

Migration from dcm4che2 to dcm4che3

I have used the dcm4che2 APIs listed below, from this repository http://www.dcm4che.org/maven2/dcm4che/, in my Java project.
dcm4che-core-2.0.29.jar
org.dcm4che2.data.DicomObject
org.dcm4che2.io.StopTagInputHandler
org.dcm4che2.data.BasicDicomObject
org.dcm4che2.data.UIDDictionary
org.dcm4che2.data.DicomElement
org.dcm4che2.data.SimpleDcmElement
org.dcm4che2.net.service.StorageCommitmentService
org.dcm4che2.util.CloseUtils
dcm4che-net-2.0.29.jar
org.dcm4che2.net.CommandUtils
org.dcm4che2.net.ConfigurationException
org.dcm4che2.net.NetworkApplicationEntity
org.dcm4che2.net.NetworkConnection
org.dcm4che2.net.NewThreadExecutor
org.dcm4che2.net.service.StorageService
org.dcm4che2.net.service.VerificationService
I now want to migrate to dcm4che3, but the APIs listed above are not found in dcm4che3, which I downloaded from this repository: http://sourceforge.net/projects/dcm4che/files/dcm4che3/
Could you please suggest an alternative approach?
As you have already observed, the BasicDicomObject is history -- alongside quite a few others.
The new "Dicom object" is Attributes -- an object is a collection of attributes.
Therefore, you create Attributes, populate them with the tags you need for RQ-behaviour (C-FIND, etc) and what you get in return is another Attributes object from which you pull the tags you want.
In my opinion, dcm4che 2.x was vague on the subject of dealing with individual value representations. dcm4che 3.x is quite a bit clearer.
The migration demands a rewrite of your code regarding how you query and how you treat individual tags. On the other hand, dcm4che 3.x makes the new code less convoluted.
On request, I have added the initial setup of a connection to some service class provider (SCP):
// Based on org.dcm4che:dcm4che-core:5.25.0 and org.dcm4che:dcm4che-net:5.25.0
import org.dcm4che3.data.*;
import org.dcm4che3.net.*;
import org.dcm4che3.net.pdu.AAssociateRQ;
import org.dcm4che3.net.pdu.PresentationContext;
import org.dcm4che3.net.pdu.RoleSelection;
import org.dcm4che3.net.pdu.UserIdentityRQ;
// Client side representation of the connection. As a client, I will
// not be listening for incoming traffic (but I could choose to do so
// if I need to transfer data via MOVE)
Connection local = new Connection();
local.setHostname("client.on.network.com");
local.setPort(Connection.NOT_LISTENING);
// Remote side representation of the connection
Connection remote = new Connection();
remote.setHostname("pacs.on.network.com");
remote.setPort(4100);
remote.setTlsProtocols(local.getTlsProtocols());
remote.setTlsCipherSuites(local.getTlsCipherSuites());
// Calling application entity
ApplicationEntity ae = new ApplicationEntity("MeAsAServiceClassUser".toUpperCase());
ae.setAETitle("MeAsAServiceClassUser");
ae.addConnection(local); // on which we may not be listening
ae.setAssociationInitiator(true);
ae.setAssociationAcceptor(false);
// Device
Device device = new Device("MeAsAServiceClassUser".toLowerCase());
device.addConnection(local);
device.addApplicationEntity(ae);
// Configure association
AAssociateRQ rq = new AAssociateRQ();
rq.setCallingAET("MeAsAServiceClassUser");
rq.setCalledAET("NameThatIdentifiesTheProvider"); // e.g. "GEPACS"
rq.setImplVersionName("MY-SCU-1.0"); // Max 16 chars
// Credentials (if appropriate)
String username = "username";
String passcode = "so secret";
if (null != username && username.length() > 0 && null != passcode && passcode.length() > 0) {
rq.setUserIdentityRQ(UserIdentityRQ.usernamePasscode(username, passcode.toCharArray(), true));
}
Example, pinging the PACS (using the setup above):
String[] TRANSFER_SYNTAX_CHAIN = {
UID.ExplicitVRLittleEndian,
UID.ImplicitVRLittleEndian
};
// Define transfer capabilities for verification SOP class
ae.addTransferCapability(
new TransferCapability(null,
/* SOP Class */ UID.Verification,
/* Role */ TransferCapability.Role.SCU,
/* Transfer syntax */ TRANSFER_SYNTAX_CHAIN)
);
// Setup presentation context
rq.addPresentationContext(
new PresentationContext(
rq.getNumberOfPresentationContexts() * 2 + 1,
/* abstract syntax */ UID.Verification,
/* transfer syntax */ TRANSFER_SYNTAX_CHAIN
)
);
rq.addRoleSelection(new RoleSelection(UID.Verification, /* is SCU? */ true, /* is SCP? */ false));
try {
// 1) Open a connection to the SCP
Association as = ae.connect(local, remote, rq);
// 2) PING!
DimseRSP rsp = as.cecho();
rsp.next(); // Consume reply, which may fail
// Still here? Success!
// 3) Close the connection to the SCP
if (as.isReadyForDataTransfer()) {
as.waitForOutstandingRSP();
as.release();
}
} catch (Throwable ignore) {
// Failure
}
Another example, retrieving studies from a PACS given accession numbers; setting up the query and handling the result:
String modality = null; // e.g. "OT"
String accessionNumber = "1234567890";
//--------------------------------------------------------
// HERE follows setup of a query, using an Attributes object
//--------------------------------------------------------
Attributes query = new Attributes();
// Indicate character set
{
int tag = Tag.SpecificCharacterSet;
VR vr = ElementDictionary.vrOf(tag, query.getPrivateCreator(tag));
query.setString(tag, vr, "ISO_IR 100");
}
// Study level query
{
int tag = Tag.QueryRetrieveLevel;
VR vr = ElementDictionary.vrOf(tag, query.getPrivateCreator(tag));
query.setString(tag, vr, "STUDY");
}
// Accession number
{
int tag = Tag.AccessionNumber;
VR vr = ElementDictionary.vrOf(tag, query.getPrivateCreator(tag));
query.setString(tag, vr, accessionNumber);
}
// Optionally filter on modality in study if 'modality' is provided,
// otherwise retrieve modality
{
int tag = Tag.ModalitiesInStudy;
VR vr = ElementDictionary.vrOf(tag, query.getPrivateCreator(tag));
if (null != modality && modality.length() > 0) {
query.setString(tag, vr, modality);
} else {
query.setNull(tag, vr);
}
}
// We are interested in study instance UID
{
int tag = Tag.StudyInstanceUID;
VR vr = ElementDictionary.vrOf(tag, query.getPrivateCreator(tag));
query.setNull(tag, vr);
}
// Do the actual query, needing an ApplicationEntity (ae),
// a local (local) and remote (remote) Connection, and
// an AAssociateRQ (rq) set up earlier.
try {
// 1) Open a connection to the SCP
Association as = ae.connect(local, remote, rq);
// 2) Query
int priority = 0x0002; // low for the sake of demo :)
as.cfind(UID.StudyRootQueryRetrieveInformationModelFind, priority, query, null,
new DimseRSPHandler(as.nextMessageID()) {
@Override
public void onDimseRSP(Association assoc, Attributes cmd,
Attributes response) {
super.onDimseRSP(assoc, cmd, response);
int status = cmd.getInt(Tag.Status, -1);
if (Status.isPending(status)) {
//--------------------------------------------------------
// HERE follows handling of the response, which
// is just another Attributes object
//--------------------------------------------------------
String studyInstanceUID = response.getString(Tag.StudyInstanceUID);
// etc...
}
}
});
// 3) Close the connection to the SCP
if (as.isReadyForDataTransfer()) {
as.waitForOutstandingRSP();
as.release();
}
}
catch (Exception e) {
// Failure
}
More on this at https://github.com/FrodeRanders/dicom-tools

Direct download from Google Drive using Google Drive API

My desktop application, written in Java, tries to download public files from Google Drive. As I found out, this can be done using the file's webContentLink, which allows public files to be downloaded without user authorization.
So, the code below works with small files:
String webContentLink = aFile.getWebContentLink();
InputStream in = new URL(webContentLink).openStream();
But it doesn't work for big files, because a big file can't be downloaded directly via webContentLink without the user confirming the Google virus scan warning. See an example: web content link.
So my question is: how do I get the content of a public file from Google Drive without user authorization?
Update December 8th, 2015
According to Google Support using the
googledrive.com/host/ID
method will be turned off on Aug 31st, 2016.
I just ran into this issue.
The trick is to treat your Google Drive folder like a web host.
Update April 1st, 2015
Google Drive has changed and there's a simple way to link directly to your drive. I left my previous answers below for reference, but here's an updated answer.
Create a Public folder in Google Drive.
Share this drive publicly.
Get your Folder UUID from the address bar when you're in that folder
Put that UUID in this URL
https://googledrive.com/host/<folder UUID>/
Add the file name to where your file is located.
https://googledrive.com/host/<folder UUID>/<file name>
This is intended functionality by Google; see the new Google Drive link.
All you have to do is get the host URL for a publicly shared drive folder. To do this, you can upload a plain HTML file and preview it in Google Drive to find your host URL.
Here are the steps:
Create a folder in Google Drive.
Share this drive publicly.
Upload a simple HTML file. Add any additional files (subfolders ok)
Open and "preview" the HTML file in Google Drive
Get the URL address for this folder
Create a direct link URL from your URL folder base
This URL should allow direct downloads of your large files.
[edit]
I forgot to add: if you use subfolders to organize your files, you simply use the folder name as you would expect in a URL hierarchy.
https://googledrive.com/host/<your public folders id string>/images/my-image.png
What I was looking to do
I created a custom Debian image with Virtual Box for Vagrant. I wanted to share this ".box" file with colleagues so they could put the direct link into their Vagrantfile.
In the end, I needed a direct link to the actual file.
Google Drive problem
If you set the file permissions to be publicly available and create/generate a direct access link by using something like the gdocs2direct tool or just crafting the link yourself:
https://docs.google.com/uc?export=download&id=<your file id>
You will get a cookie-based verification code and a "Google could not scan this file" prompt, which won't work for things such as wget or Vagrantfile configs.
The code it generates is a simple code that appends the GET query variable ...&confirm=### to the string, but it's user-specific, so you can't copy and paste that query variable for others.
But if you use the above "Web page hosting" method, you can get around that prompt.
I hope that helps!
If you face the "This file cannot be checked for viruses" intermezzo page, the download is not that easy.
You essentially need to first download the normal download link, which however redirects you to the "Download anyway" page. You need to store cookies from this first request, find out the link pointed to by the "Download anyway" button, and then use this link to download the file, but reusing the cookies you got from the first request.
Here's a bash variant of the download process using CURL:
curl -c /tmp/cookies "https://drive.google.com/uc?export=download&id=DOCUMENT_ID" > /tmp/intermezzo.html
curl -L -b /tmp/cookies "https://drive.google.com$(cat /tmp/intermezzo.html | grep -Po 'uc-download-link" [^>]* href="\K[^"]*' | sed 's/\&amp;/\&/g')" > FINAL_DOWNLOADED_FILENAME
Notes:
this procedure will probably stop working after some Google changes
the grep command uses Perl syntax (-P) and the \K "operator", which essentially means "do not include anything preceding \K in the matched result". I don't know which version of grep introduced these options, but ancient or non-Ubuntu versions probably don't have them
a Java solution would be more or less the same, just take a HTTPS library which can handle cookies, and some nice text-parsing library
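As a rough illustration of the text-parsing half of such a Java solution, the sketch below extracts the "Download anyway" link from the intermezzo page using the same marker the grep pattern relies on (uc-download-link followed by an href attribute). The class name ConfirmLinkExtractor is an assumption, the page layout is taken only from the grep pattern above and may have changed, and cookie handling for the second request is not shown.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConfirmLinkExtractor {
    // Same idea as the grep pattern: find the element marked uc-download-link
    // and capture the value of its href attribute.
    private static final Pattern LINK =
            Pattern.compile("uc-download-link\"[^>]*\\shref=\"([^\"]*)\"");

    // Returns the absolute confirmation URL, or empty if the marker is not found.
    static Optional<String> extract(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        // The href is HTML-escaped, so &amp; must be turned back into &.
        String href = m.group(1).replace("&amp;", "&");
        return Optional.of("https://drive.google.com" + href);
    }

    public static void main(String[] args) {
        // Hypothetical fragment shaped like the page the grep pattern expects.
        String html = "<a id=\"uc-download-link\" class=\"btn\" "
                + "href=\"/uc?export=download&amp;confirm=CODE&amp;id=DOCUMENT_ID\">Download anyway</a>";
        System.out.println(extract(html).orElse("no link found"));
    }
}
```

The returned URL would then be requested with the cookies saved from the first response, as in the curl commands.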
I know this is an old question but I could not find a solution to this problem after some research, so I am sharing what worked for me.
I have written this C# code for one of my projects. It can bypass the virus scan warning programmatically. The code can probably be converted to Java.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.IO;
using System.Net;
using System.Text;
public class FileDownloader : IDisposable
{
private const string GOOGLE_DRIVE_DOMAIN = "drive.google.com";
private const string GOOGLE_DRIVE_DOMAIN2 = "https://drive.google.com";
// In the worst case, it is necessary to send 3 download requests to the Drive address
// 1. an NID cookie is returned instead of a download_warning cookie
// 2. download_warning cookie returned
// 3. the actual file is downloaded
private const int GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT = 3;
public delegate void DownloadProgressChangedEventHandler( object sender, DownloadProgress progress );
// Custom download progress reporting (needed for Google Drive)
public class DownloadProgress
{
public long BytesReceived, TotalBytesToReceive;
public object UserState;
public int ProgressPercentage
{
get
{
if( TotalBytesToReceive > 0L )
return (int) ( ( (double) BytesReceived / TotalBytesToReceive ) * 100 );
return 0;
}
}
}
// Web client that preserves cookies (needed for Google Drive)
private class CookieAwareWebClient : WebClient
{
private class CookieContainer
{
private readonly Dictionary<string, string> cookies = new Dictionary<string, string>();
public string this[Uri address]
{
get
{
string cookie;
if( cookies.TryGetValue( address.Host, out cookie ) )
return cookie;
return null;
}
set
{
cookies[address.Host] = value;
}
}
}
private readonly CookieContainer cookies = new CookieContainer();
public DownloadProgress ContentRangeTarget;
protected override WebRequest GetWebRequest( Uri address )
{
WebRequest request = base.GetWebRequest( address );
if( request is HttpWebRequest )
{
string cookie = cookies[address];
if( cookie != null )
( (HttpWebRequest) request ).Headers.Set( "cookie", cookie );
if( ContentRangeTarget != null )
( (HttpWebRequest) request ).AddRange( 0 );
}
return request;
}
protected override WebResponse GetWebResponse( WebRequest request, IAsyncResult result )
{
return ProcessResponse( base.GetWebResponse( request, result ) );
}
protected override WebResponse GetWebResponse( WebRequest request )
{
return ProcessResponse( base.GetWebResponse( request ) );
}
private WebResponse ProcessResponse( WebResponse response )
{
string[] cookies = response.Headers.GetValues( "Set-Cookie" );
if( cookies != null && cookies.Length > 0 )
{
int length = 0;
for( int i = 0; i < cookies.Length; i++ )
length += cookies[i].Length;
StringBuilder cookie = new StringBuilder( length );
for( int i = 0; i < cookies.Length; i++ )
cookie.Append( cookies[i] );
this.cookies[response.ResponseUri] = cookie.ToString();
}
if( ContentRangeTarget != null )
{
string[] rangeLengthHeader = response.Headers.GetValues( "Content-Range" );
if( rangeLengthHeader != null && rangeLengthHeader.Length > 0 )
{
int splitIndex = rangeLengthHeader[0].LastIndexOf( '/' );
if( splitIndex >= 0 && splitIndex < rangeLengthHeader[0].Length - 1 )
{
long length;
if( long.TryParse( rangeLengthHeader[0].Substring( splitIndex + 1 ), out length ) )
ContentRangeTarget.TotalBytesToReceive = length;
}
}
}
return response;
}
}
private readonly CookieAwareWebClient webClient;
private readonly DownloadProgress downloadProgress;
private Uri downloadAddress;
private string downloadPath;
private bool asyncDownload;
private object userToken;
private bool downloadingDriveFile;
private int driveDownloadAttempt;
public event DownloadProgressChangedEventHandler DownloadProgressChanged;
public event AsyncCompletedEventHandler DownloadFileCompleted;
public FileDownloader()
{
webClient = new CookieAwareWebClient();
webClient.DownloadProgressChanged += DownloadProgressChangedCallback;
webClient.DownloadFileCompleted += DownloadFileCompletedCallback;
downloadProgress = new DownloadProgress();
}
public void DownloadFile( string address, string fileName )
{
DownloadFile( address, fileName, false, null );
}
public void DownloadFileAsync( string address, string fileName, object userToken = null )
{
DownloadFile( address, fileName, true, userToken );
}
private void DownloadFile( string address, string fileName, bool asyncDownload, object userToken )
{
downloadingDriveFile = address.StartsWith( GOOGLE_DRIVE_DOMAIN ) || address.StartsWith( GOOGLE_DRIVE_DOMAIN2 );
if( downloadingDriveFile )
{
address = GetGoogleDriveDownloadAddress( address );
driveDownloadAttempt = 1;
webClient.ContentRangeTarget = downloadProgress;
}
else
webClient.ContentRangeTarget = null;
downloadAddress = new Uri( address );
downloadPath = fileName;
downloadProgress.TotalBytesToReceive = -1L;
downloadProgress.UserState = userToken;
this.asyncDownload = asyncDownload;
this.userToken = userToken;
DownloadFileInternal();
}
private void DownloadFileInternal()
{
if( !asyncDownload )
{
webClient.DownloadFile( downloadAddress, downloadPath );
// This callback isn't triggered for synchronous downloads, manually trigger it
DownloadFileCompletedCallback( webClient, new AsyncCompletedEventArgs( null, false, null ) );
}
else if( userToken == null )
webClient.DownloadFileAsync( downloadAddress, downloadPath );
else
webClient.DownloadFileAsync( downloadAddress, downloadPath, userToken );
}
private void DownloadProgressChangedCallback( object sender, DownloadProgressChangedEventArgs e )
{
if( DownloadProgressChanged != null )
{
downloadProgress.BytesReceived = e.BytesReceived;
if( e.TotalBytesToReceive > 0L )
downloadProgress.TotalBytesToReceive = e.TotalBytesToReceive;
DownloadProgressChanged( this, downloadProgress );
}
}
private void DownloadFileCompletedCallback( object sender, AsyncCompletedEventArgs e )
{
if( !downloadingDriveFile )
{
if( DownloadFileCompleted != null )
DownloadFileCompleted( this, e );
}
else
{
if( driveDownloadAttempt < GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT && !ProcessDriveDownload() )
{
// Try downloading the Drive file again
driveDownloadAttempt++;
DownloadFileInternal();
}
else if( DownloadFileCompleted != null )
DownloadFileCompleted( this, e );
}
}
// Downloading large files from Google Drive prompts a warning screen and requires manual confirmation
// Consider that case and try to confirm the download automatically if warning prompt occurs
// Returns true, if no more download requests are necessary
private bool ProcessDriveDownload()
{
FileInfo downloadedFile = new FileInfo( downloadPath );
if( downloadedFile == null )
return true;
// Confirmation page is around 50KB, shouldn't be larger than 60KB
if( downloadedFile.Length > 60000L )
return true;
// Downloaded file might be the confirmation page, check it
string content;
using( var reader = downloadedFile.OpenText() )
{
// Confirmation page starts with <!DOCTYPE html>, which can be preceded by a newline
char[] header = new char[20];
int readCount = reader.ReadBlock( header, 0, 20 );
if( readCount < 20 || !( new string( header ).Contains( "<!DOCTYPE html>" ) ) )
return true;
content = reader.ReadToEnd();
}
int linkIndex = content.LastIndexOf( "href=\"/uc?" );
if( linkIndex < 0 )
return true;
linkIndex += 6;
int linkEnd = content.IndexOf( '"', linkIndex );
if( linkEnd < 0 )
return true;
downloadAddress = new Uri( "https://drive.google.com" + content.Substring( linkIndex, linkEnd - linkIndex ).Replace( "&amp;", "&" ) );
return false;
}
// Handles the following formats (links can be preceded by https://):
// - drive.google.com/open?id=FILEID
// - drive.google.com/file/d/FILEID/view?usp=sharing
// - drive.google.com/uc?id=FILEID&export=download
private string GetGoogleDriveDownloadAddress( string address )
{
int index = address.IndexOf( "id=" );
int closingIndex;
if( index > 0 )
{
index += 3;
closingIndex = address.IndexOf( '&', index );
if( closingIndex < 0 )
closingIndex = address.Length;
}
else
{
index = address.IndexOf( "file/d/" );
if( index < 0 ) // address is not in any of the supported forms
return string.Empty;
index += 7;
closingIndex = address.IndexOf( '/', index );
if( closingIndex < 0 )
{
closingIndex = address.IndexOf( '?', index );
if( closingIndex < 0 )
closingIndex = address.Length;
}
}
return string.Concat( "https://drive.google.com/uc?id=", address.Substring( index, closingIndex - index ), "&export=download" );
}
public void Dispose()
{
webClient.Dispose();
}
}
And here's how you can use it:
// NOTE: FileDownloader is IDisposable!
FileDownloader fileDownloader = new FileDownloader();
// This callback is triggered for DownloadFileAsync only
fileDownloader.DownloadProgressChanged += ( sender, e ) => Console.WriteLine( "Progress changed " + e.BytesReceived + " " + e.TotalBytesToReceive );
// This callback is triggered for both DownloadFile and DownloadFileAsync
fileDownloader.DownloadFileCompleted += ( sender, e ) => Console.WriteLine( "Download completed" );
fileDownloader.DownloadFileAsync( "https://INSERT_DOWNLOAD_LINK_HERE", @"C:\downloadedFile.txt" );
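Since the answer suggests the code can be converted to Java, here is a sketch of one small piece: a direct port of the link-normalization logic in GetGoogleDriveDownloadAddress. The class and method names are assumptions, and it covers only the three link formats listed in the C# comments.

```java
public class DriveLinkNormalizer {
    // Extracts the file id from a supported Drive link and rebuilds it as a
    // uc?id=...&export=download address. Returns "" for unsupported forms.
    static String toDownloadAddress(String address) {
        int index = address.indexOf("id=");
        int closingIndex;
        if (index > 0) {
            // Forms: open?id=FILEID and uc?id=FILEID&export=download
            index += 3;
            closingIndex = address.indexOf('&', index);
            if (closingIndex < 0) {
                closingIndex = address.length();
            }
        } else {
            // Form: file/d/FILEID/view?usp=sharing
            index = address.indexOf("file/d/");
            if (index < 0) {
                return "";
            }
            index += 7;
            closingIndex = address.indexOf('/', index);
            if (closingIndex < 0) {
                closingIndex = address.indexOf('?', index);
                if (closingIndex < 0) {
                    closingIndex = address.length();
                }
            }
        }
        return "https://drive.google.com/uc?id="
                + address.substring(index, closingIndex) + "&export=download";
    }

    public static void main(String[] args) {
        System.out.println(toDownloadAddress("https://drive.google.com/file/d/FILEID/view?usp=sharing"));
    }
}
```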
Case 1: downloading a small file.
You can use a URL of the form https://drive.google.com/uc?export=download&id=FILE_ID, and the file's input stream can then be obtained directly.
Case 2: downloading a large file.
You hit a wall: a virus scan alert page is returned instead of the file. I tried parsing the HTML DOM to get the link with the confirm code under the "Download anyway" button, but it didn't work. It may require cookie or session info.
SOLUTION:
I finally found a solution for both cases above. You just need to call httpConnection.setDoOutput(true) when setting up the connection to get a JSON response:
)]}' { "disposition":"SCAN_CLEAN",
"downloadUrl":"http:www...",
"fileName":"exam_list_json.txt", "scanResult":"OK", "sizeBytes":2392}
Then you can use any JSON parser to read downloadUrl, fileName and sizeBytes.
You can refer to the following snippet; I hope it helps.
private InputStream gConnect(String remoteFile) throws IOException {
    // Constants, mContext, saveGHtmlFile, getRealUrl, parseUrl and saveGJsonFile
    // come from my own project and are not shown here.
    URL url = new URL(remoteFile);
    URLConnection connection = url.openConnection();
    if (connection instanceof HttpURLConnection) {
        HttpURLConnection httpConnection = (HttpURLConnection) connection;
        connection.setAllowUserInteraction(false);
        httpConnection.setInstanceFollowRedirects(true);
        httpConnection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows 2000)");
        // This is the call that makes the server answer with JSON.
        httpConnection.setDoOutput(true);
        httpConnection.setRequestMethod("GET");
        httpConnection.connect();
        int reqCode = httpConnection.getResponseCode();
        if (reqCode == HttpURLConnection.HTTP_OK) {
            InputStream is = httpConnection.getInputStream();
            Map<String, List<String>> map = httpConnection.getHeaderFields();
            List<String> values = map.get("content-type");
            if (values != null && !values.isEmpty()) {
                String type = values.get(0);
                if (type.contains("text/html")) {
                    // Virus scan page: save it, extract the real link, retry with the cookie.
                    String cookie = httpConnection.getHeaderField("Set-Cookie");
                    String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.html";
                    if (saveGHtmlFile(is, temp)) {
                        String href = getRealUrl(temp);
                        if (href != null) {
                            return parseUrl(href, cookie);
                        }
                    }
                } else if (type.contains("application/json")) {
                    // JSON response: save it, read downloadUrl, then follow that URL.
                    String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.txt";
                    if (saveGJsonFile(is, temp)) {
                        FileDataSet data = JsonReaderHelper.readFileDataset(new File(temp));
                        if (data.getPath() != null) {
                            return parseUrl(data.getPath());
                        }
                    }
                }
            }
            // Any other content type: the response body is the file itself.
            return is;
        }
    }
    return null;
}
And
public static FileDataSet readFileDataset(File file) throws IOException {
    // Assumes the )]}' prefix was stripped before the file was saved;
    // a strict reader fails on it.
    // try-with-resources closes the reader (and the underlying file) in all cases.
    try (JsonReader reader = new JsonReader(
            new InputStreamReader(new FileInputStream(file), "UTF-8"))) {
        reader.beginObject();
        FileDataSet rs = new FileDataSet();
        while (reader.hasNext()) {
            String name = reader.nextName();
            if (name.equals("downloadUrl")) {
                rs.setPath(reader.nextString());
            } else if (name.equals("fileName")) {
                rs.setName(reader.nextString());
            } else if (name.equals("sizeBytes")) {
                rs.setSize(reader.nextLong());
            } else {
                // Ignore fields we do not need (disposition, scanResult, ...).
                reader.skipValue();
            }
        }
        reader.endObject();
        return rs;
    }
}
This seems to be updated again as of May 19, 2015:
How I got it to work:
As in jmbertucci's recently updated answer, make your folder public to everyone. This is a bit more complicated than before: you have to click Advanced to change the folder to "On - Public on the web."
Find your folder UUID as before--just go into the folder and find your UUID in the address bar:
https://drive.google.com/drive/folders/<folder UUID>
Then head to
https://googledrive.com/host/<folder UUID>
It will redirect you to an index type page with a giant subdomain, but you should be able to see the files in your folder. Then you can right click to save the link to the file you want (I noticed that this direct link also has this big subdomain for googledrive.com). Worked great for me with wget.
This also seems to work with others' shared folders.
e.g.,
https://drive.google.com/folderview?id=0B7l10Bj_LprhQnpSRkpGMGV2eE0&usp=sharing
maps to
https://googledrive.com/host/0B7l10Bj_LprhQnpSRkpGMGV2eE0
And a right click can save a direct link to any of those files.
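If you want to do that mapping in code rather than by hand, here is a rough sketch. The class name FolderHostLink and method toHostUrl are made up, and it assumes the folder ID appears either as the id query parameter (folderview links) or after /folders/ (address bar links), as in the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites a shared-folder URL to the googledrive.com/host form.
final class FolderHostLink {
    // folderview?id=<folder UUID>&usp=sharing
    private static final Pattern ID_PARAM = Pattern.compile("[?&]id=([^&#]+)");
    // /drive/folders/<folder UUID>
    private static final Pattern FOLDERS_PATH = Pattern.compile("/folders/([^/?#]+)");

    static String toHostUrl(String folderUrl) {
        Matcher m = ID_PARAM.matcher(folderUrl);
        if (!m.find()) {
            // Fall back to the address-bar form of the folder URL.
            m = FOLDERS_PATH.matcher(folderUrl);
            if (!m.find()) {
                throw new IllegalArgumentException("No folder ID found in: " + folderUrl);
            }
        }
        return "https://googledrive.com/host/" + m.group(1);
    }
}
```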
Using a Service Account might work for you.
Check this out:
wget https://raw.githubusercontent.com/circulosmeos/gdown.pl/master/gdown.pl
chmod +x gdown.pl
./gdown.pl https://drive.google.com/file/d/FILE_ID/view TARGET_PATH
Update as of August 2020:
This is what worked for me recently -
Upload your file and get a shareable link which anyone can see (change the permission from "Restricted" to "Anyone with the link" in the share link options).
Then run:
SHAREABLE_LINK=<google drive shareable link>
curl -L https://drive.google.com/uc\?id\=$(echo $SHAREABLE_LINK | cut -f6 -d"/")
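Since the question is about Java, the same idea (take the 6th "/"-separated field of the shareable link as the file ID) might look roughly like this. ShareLinkId and its methods are hypothetical names, and the sketch assumes the link has the https://drive.google.com/file/d/FILE_ID/... shape.

```java
// Hypothetical helper mirroring `cut -f6 -d"/"` from the shell command above.
final class ShareLinkId {
    static String fileId(String shareableLink) {
        // "https:", "", "drive.google.com", "file", "d", "<FILE_ID>", ...
        String[] parts = shareableLink.split("/");
        if (parts.length < 6 || parts[5].isEmpty()) {
            throw new IllegalArgumentException("Unexpected link format: " + shareableLink);
        }
        return parts[5];
    }

    // URL that the curl command above requests.
    static String downloadUrl(String shareableLink) {
        return "https://drive.google.com/uc?id=" + fileId(shareableLink);
    }
}
```

Because this relies purely on position, it returns the wrong segment for links with a different layout, just as the cut command would.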
If you just want to download a file programmatically through the Google Drive API (as opposed to giving the user a link to open in a browser), I would suggest using the downloadUrl of the file instead of the webContentLink, as documented here: https://developers.google.com/drive/web/manage-downloads
https://github.com/google/skicka
I used this command line tool to download files from Google Drive. Just follow the instructions in Getting Started section and you should download files from Google Drive in minutes.
For any shared link, replace FILENAME and FILEID (for very large files requiring confirmation):
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=FILEID" -O FILENAME && rm -rf /tmp/cookies.txt
(For small files):
wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O FILENAME
I would consider downloading from the link, scraping the page that you get to grab the confirmation link, and then downloading that.
If you look at the "download anyway" URL it has an extra confirm query parameter with a seemingly randomly generated token. Since it's random...and you probably don't want to figure out how to generate it yourself, scraping might be the easiest way without knowing anything about how the site works.
You may need to consider various scenarios.
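A rough sketch of the scraping step in Java: once you have the HTML of the warning page as a string, pull out the confirm token with a regex. The class name ConfirmToken is made up, and the pattern simply reuses the [0-9A-Za-z_]+ character class from the sed expression in the wget one-liner; the real page layout may differ or change.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the first confirm=<token> occurrence in the page HTML.
final class ConfirmToken {
    private static final Pattern CONFIRM = Pattern.compile("confirm=([0-9A-Za-z_]+)");

    static Optional<String> find(String html) {
        Matcher m = CONFIRM.matcher(html);
        // Empty when the page has no confirm parameter (e.g. a small file).
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

The token then goes back into the download URL as the confirm parameter, and the second request has to carry the cookies from the first one.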
I simply wrote a JavaScript userscript that automatically captures the link, starts the download, and closes the tab, with the help of Tampermonkey.
// ==UserScript==
// @name Bypass Google drive virus scan
// @namespace SmartManoj
// @version 0.1
// @description Quickly get the download link
// @author SmartManoj
// @match https://drive.google.com/uc?id=*&export=download*
// @grant none
// ==/UserScript==
function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}
async function demo() {
    await sleep(5000);
    window.close();
}
(function() {
    location.replace(document.getElementById("uc-download-link").href);
    demo();
})();
Similarly, you can fetch the HTML source of the URL and download the file in Java.
I faced an issue with direct download because I was logged in to multiple Google accounts.
The solution is to append the authuser=0 parameter. Sample request URL for the download: https://drive.google.com/uc?id=FILEID&authuser=0&export=download
Use https://drive.google.com/uc?export=download&id=FILE_ID, replacing FILE_ID with the file's ID.
If you don't know where the file ID is, check this article: Article LINK
