Finding size of files locally/remotely

I'm writing a program to find the size of files, both local and remote.
Is this possible in Java?
In PHP, I know there is filesize().
Another alternative is using ab http:// on Unix, but how can that be integrated with Java?
What do you think is the best or most efficient way to approach this?

You can use Java's Runtime to execute the command, read its output into a buffer, and display it.
Runtime rt = Runtime.getRuntime();
Process proc = rt.exec("ab http://whatever");
// Read the process's output stream into a buffer and display the results.
If you have the file locally, you can use File.length().
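For the local case, here is a minimal sketch of the two standard-library options. The class and method names (LocalFileSize, sizeWithFile, sizeWithNio) are made up for illustration. Note that File.length() returns 0 for a missing file, while Files.size() throws.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalFileSize {
    /** Size in bytes via java.io.File; returns 0 if the file does not exist. */
    static long sizeWithFile(String path) {
        return new File(path).length();
    }

    /** Size in bytes via java.nio; throws an IOException if the file cannot be read. */
    static long sizeWithNio(Path path) throws IOException {
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file of known size so the demo needs no input.
        Path tmp = Files.createTempFile("size-demo", ".bin");
        try {
            Files.write(tmp, new byte[1234]);
            System.out.println("File.length(): " + sizeWithFile(tmp.toString()));
            System.out.println("Files.size():  " + sizeWithNio(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Files.size() is probably the safer choice when you need to tell "empty file" apart from "no such file".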

Retrieve the page, extract its links, and then request only the headers for each URI.
filesize() in PHP may be dicey: whether you're allowed to use it on a remote file depends entirely on your host's configuration. You might consider curl instead.
Using curl from a shell, for instance, to look at an ad on the right-hand side of the page as I write this:
curl -I http://static.adzerk.net/Advertisers/180414077f314dbdbaa8d8e2f7898249.gif
...yields, among other things:
Content-Type: image/gif
Content-Length: 17798
...which may be what you're looking for. Within PHP, you can get the equivalent with CURLOPT_NOBODY.
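Since the question asks about Java, the same header-only request can be sketched with HttpURLConnection. This is only an illustration: the class name RemoteFileSize, the method remoteSize, and the 5-second timeouts are my own choices. A server is free to omit Content-Length, in which case getContentLengthLong() returns -1.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RemoteFileSize {
    /**
     * Sends a HEAD request and returns the Content-Length header value,
     * or -1 if the server did not send one.
     */
    static long remoteSize(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD");   // headers only, no body is transferred
            conn.setConnectTimeout(5000);    // assumed timeouts, tune as needed
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            return conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: RemoteFileSize <url>");
            return;
        }
        System.out.println(remoteSize(args[0]) + " bytes");
    }
}
```

Unlike shelling out to an external tool, this needs nothing beyond the JDK.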

Related

Transforming JSON with XSLT using SaxonEE and Python

I am attempting to write a Python script that transforms JSON to a text file (CSV) with XSLT.
With saxon-ee-10.5.jar, I can successfully perform the desired transformation by running the following command (Windows 10):
java -cp saxon-ee-10.5.jar com.saxonica.Transform -it -xsl:styling.xslt -o:result.csv
How can I achieve the same result by using Python? I have been trying with Saxon-EE/C, but I am not sure if what I want to happen is possible.
Here is an example of what I have tried so far. My XSLT already defines an $in parameter for the initial.json file, but PyXslt30Processor.apply_templates_returning_file() seems to require a call to PyXslt30Processor.set_initial_match_selection(), and I am not sure whether non-XML files can be passed to it.
from saxonc import PySaxonProcessor
with PySaxonProcessor(license=True) as proc:
    xslt30proc = proc.new_xslt30_processor()
    xslt30proc.set_initial_match_selection(file_name='initial.json')
    content = xslt30proc.apply_templates_returning_file(
        stylesheet_file='styling.xslt',
        output_file='result.csv'
    )
    print(content)
Is what I want to accomplish possible with Saxon-EE/C, or should I try techniques of calling Java from Python?

I think you want to use call_template_returning_file instead of apply_templates_returning_file (see https://www.saxonica.com/saxon-c/doc/html/saxonc.html#PyXslt30Processor-call_template_returning_file), e.g.
xslt30proc.call_template_returning_file(None, stylesheet_file='styling.xslt',
                                        output_file='result.csv')
Using None as the template name should be identical to using -it on the command line, i.e. it starts by calling the template named xsl:initial-template.
Don't call xslt30proc.set_initial_match_selection in that case.
It might, however, help to call xslt30proc.set_cwd('.') before the call_template_returning_file call.

Determining GZIPOutputStream behavior

The following code produces deterministic files (the shasum is the same) for two strings.
try (
    FileOutputStream fos = new FileOutputStream(saveLocation);
    GZIPOutputStream zip = new GZIPOutputStream(fos, GZIP_BUFFER_SIZE);
    BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(zip, StandardCharsets.UTF_8));
) {
    writer.append(str);
}
Produces:
a.gz f0200d53f7f9b35647b5dece0146d72cd1c17949
However, if I take the file to the command line and re-zip it, I get a different result:
> gunzip -n a.gz ;gzip -n a ; shasum a.gz
50f478a9ceb292a2d14f1460d7c584b7a856e4d9 a.gz
How can I get it to match the original SHA using /usr/bin/gzip and gunzip?

I think that the problem is likely to be the Gzip file header.
The Gzip format has provision for including a file name and file timestamp in the file headers. (I see you are using the -n when uncompressing and recompressing ... which is probably correct here.)
The Gzip format also includes an "operating system id" in the header. This is supposed to identify the source file system type; e.g. 0 for FAT, 3 for UNIX, and so on.
Either of these could lead to differences in the Gzip files and hence different hashes.
If I were going to solve this myself, I would start by using cmp to see where the compressed files start to differ, and then od to identify what the differences are. Refer to the Gzip file format spec to figure out what the differences mean:
RFC 1952 - GZIP file format specification version 4.3
Wikipedia's gzip page.
How can I get it to match the original SHA using gzip and gunzip?
Assuming that the difference is the OS id, I don't think there is a practical way to solve this with the gzip and gunzip commands.
I looked at the source code for GZIPOutputStream in Java 11, and it is not promising.
It hard-wires the timestamp to zero.
It hard-wires the OS identifier to zero (which is supposed to mean FAT).
The hard-wiring is in a private method and would be next to impossible to "fix" by subclassing or reflection. You could copy the code and fix it that way, but then you would have to maintain your variant GZIPOutputStream class indefinitely.
(I would be looking at changing the application ... or whatever ... so that I didn't need the checksums to be identical. You haven't said why you are doing this. If it is for testing purposes only, try looking for a different way to implement the tests.)
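To check the header theory from the Java side without cmp and od, here is a small sketch that compresses a string in memory and decodes the fixed header fields at the offsets RFC 1952 defines (MTIME at bytes 4 to 7, OS at byte 9). The class and helper names are hypothetical. The OS byte you see depends on the JDK in use: the answer above reports 0 for Java 11, and other releases may write a different value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class GzipHeaderDemo {
    /** Compresses the text in memory with GZIPOutputStream and returns the raw bytes. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream zip = new GZIPOutputStream(bytes)) {
            zip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    /** MTIME field: little-endian unsigned 32-bit value at offsets 4..7. */
    static long mtime(byte[] gz) {
        return (gz[4] & 0xFFL)
                | (gz[5] & 0xFFL) << 8
                | (gz[6] & 0xFFL) << 16
                | (gz[7] & 0xFFL) << 24;
    }

    /** OS identifier: single byte at offset 9. */
    static int osId(byte[] gz) {
        return gz[9] & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        byte[] gz = gzip("hello");
        System.out.printf("magic=%02x %02x, method=%d%n", gz[0] & 0xFF, gz[1] & 0xFF, gz[2] & 0xFF);
        System.out.println("mtime=" + mtime(gz) + ", os=" + osId(gz));
    }
}
```

Running the same decoding over the file produced by /usr/bin/gzip should show which header field differs, if any. If the headers match, the difference is in the compressed data itself.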

How to pass argument to Java Process?

I have a Java application that creates and runs a process (x.exe). When process x is executed, it prompts the user to enter a password. How can I pass this password to the process?
I tried writing the password to the OutputStream of the process, but the process still didn't run.
Runtime r = Runtime.getRuntime();
Process p = r.exec("x.exe");
p.getOutputStream(); // use this to pass the arguments

You need to flush the stream. The process may also expect a newline at the end of the password, to simulate the ENTER key the user presses after typing it. This works for me on Linux:
Runtime r = Runtime.getRuntime();
Process p = r.exec("myTestingExe");
p.getOutputStream().write("myPassword\n".getBytes()); // notice the `\n`
p.getOutputStream().flush();
Some caveats:
This works on Linux with '\n' at the end; on Windows you may need "\r\n" instead (honestly, I'm not sure how Windows handles the ENTER key in the input).
I'm using "myPassword\n".getBytes(), which relies on the platform default charset. A more explicit form would be "myPassword\n".getBytes(Charset.forName("MyCharsetName")), where "MyCharsetName" is a supported encoding such as "UTF-8".
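Putting the write, the line terminator, the charset, and the flush together, a rough sketch might look like this. The helper sendLine and the class name are invented for illustration, and the demo assumes the child process reads the password from its standard input rather than directly from the console, which some programs do not.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ProcessInputDemo {
    /** Writes the text plus a line terminator in the given charset, then flushes. */
    static void sendLine(OutputStream out, String text, String eol, Charset charset)
            throws IOException {
        out.write((text + eol).getBytes(charset));
        out.flush(); // without this the bytes may sit in a buffer and never reach the process
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ProcessInputDemo <command> [args...]");
            return;
        }
        // Child output goes to our console; only its stdin is a pipe we write to.
        Process p = new ProcessBuilder(args)
                .inheritIO()
                .redirectInput(ProcessBuilder.Redirect.PIPE)
                .start();
        try (OutputStream stdin = p.getOutputStream()) {
            // "myPassword" and "\n" are placeholders taken from the answer above.
            sendLine(stdin, "myPassword", "\n", StandardCharsets.UTF_8);
        }
        System.out.println("exit code: " + p.waitFor());
    }
}
```

Closing the stream after writing also signals end-of-input, which some programs wait for before continuing.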

As was already pointed out, you can consider using an Expect-like library for the interaction between your Java program and a spawned OS process. Basically, you need to wait until the password prompt becomes available in the process's input stream and then write the password, terminated by an end-of-line, to the process's output stream.
If you decide to go with a third-party library, I'd recommend trying my own modern alternative to expect4j and others. It is called ExpectIt, has no dependencies, and is Apache licensed.
Here is a possible example using the library:
Process process = Runtime.getRuntime().exec("...");
Expect expect = new ExpectBuilder()
        .withInputs(process.getInputStream())
        .withOutput(process.getOutputStream())
        .withErrorOnTimeout(true)
        .build();
expect.expect(contains("Password:"));
expect.sendLine("secret");
expect.close();
Note: the contains method is statically imported.

You can try a library like expect4j to interact with the external process.

Any command line utility (like wget,curl etc) and/or java methods to get meta data of a shortened URL?

I have a stream of shortened URLs received from Twitter feeds. I don't need the entire page behind each URL, only basic meta information such as the expanded original URL, page title, timestamps, etc. I can fetch the entire page containing this metadata using curl or wget, but is there a quicker way to get only the metadata? Also, are there any Java classes or methods that do this, like curl?

HTTP HEAD requests may be what you are looking for. Here is an example that uses curl (in its Python implementation, though): Python: Convert those TinyURL (bit.ly, tinyurl, ow.ly) to full URLS
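On the Java side, a HEAD request with redirect-following switched off exposes the Location header, which is the expanded URL for a typical shortener. The sketch below resolves a single hop only; the class name ShortUrlExpander, the method expandOnce, and the timeouts are assumptions for illustration. Headers can also carry timestamps such as Last-Modified, but the page title lives in the HTML body, so a HEAD request cannot return it.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class ShortUrlExpander {
    /**
     * Sends one HEAD request without following redirects and returns the
     * Location header if the server answered with a 3xx status; otherwise
     * returns the input URL unchanged.
     */
    static String expandOnce(String shortUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(shortUrl).toURL().openConnection();
        try {
            conn.setInstanceFollowRedirects(false); // we want to see the redirect itself
            conn.setRequestMethod("HEAD");          // headers only
            conn.setConnectTimeout(5000);           // assumed timeouts
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            boolean redirected = status >= 300 && status < 400 && location != null;
            return redirected ? location : shortUrl;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: ShortUrlExpander <short-url>");
            return;
        }
        System.out.println(expandOnce(args[0]));
    }
}
```

Shorteners sometimes chain redirects, so in practice you might call expandOnce in a loop with a hop limit until the URL stops changing.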

PHP Problem : filesize() return 0 with file containing few data?

I use PHP to call a Java command and then redirect its result into a file called result.txt. For example, the file contains this:
"Result is : 5.0"
However, filesize() returns 0, and when I check with the 'ls -l' command the size is also 0. Because I print the result to the screen only when the file size != 0, nothing is printed. How can I get the size in bytes? Or is there another solution available?

From the docs, when you call filesize, PHP caches this result in the stat cache.
Have you tried clearing the stat cache?
clearstatcache();
If that does not work, a possible workaround is to open the file, seek to its end, and then use ftell.
$fp = fopen($filename, "rb");
fseek($fp, 0, SEEK_END);
$size = ftell($fp);
fclose($fp);
If you are actually planning to display the output to the user, you can instead read the entire file and then use strlen.
$data = file_get_contents($filename);
$size = strlen($data);

Which function do you use?
exec() can assign the result directly to a variable, so there may be no need to save the output to a file if you just want to load it in PHP.

You say:
I use PHP to call a Java command then forward its result into a file called result.txt.
Who writes the result?
1. The Java program?
2. Do you capture the output in PHP and write it to the file?
3. Do you just redirect the output from the command line?
In cases 1 and 3 there may be a delay before the result is written to the file. So if you read the file in PHP without waiting for the execution to finish, you read it before the result has been written.
