I am looking for a good way to store data online. I am programming in Java on Android, so there should be an interface for accessing the data. Most of the files are images, and the number of images keeps growing. Each image is nearly 200 KB.
What is a common way to store this data? I need good performance, so there should be fast responses and unlimited traffic. Maybe you can show me some options for secure data storage.
I have looked at web servers for storing data, but many of them do not allow storing application data such as images.
As I understand it, you don't want to use a DB for storing images. OK, so the solution is to use file storage. You may want to take a look at Amazon S3 (to my mind, a great solution for storing static content) or Google Cloud Storage.
Related
I am working on a project where I have extracted images from a sensor and saved them to a directory on the operating system. I have a Java API for uploading images to the server.
I need to upload these images, along with some other data (typically of float type), to the main server.
I need to decide on an intermediary: either a database where I store the images and connect through Java to upload them, or HDFS.
Can somebody please advise which option is best for storing images: a database or HDFS?
Note: There are up to 150 thousand images, and there may be more in the future.
I think the best way is to keep the floats you need and the image metadata in the database, for easier searching and querying and easier interaction from Java. The actual images are best stored on a file system, to avoid converting them to and from the database. I believe a simple file system would be good enough for that volume of images. You probably won't use any of the fancy HDFS features such as MapReduce. But that's up to you.
So if a standard file system isn't good enough for you and you want something bigger, then HDFS is the way to go. Either way, the proper approach would be a mixture of the two: a database for the metadata and a file store for the images.
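A minimal sketch of that split, under stated assumptions: the class name ImageStore, the ".img" suffix, and the in-memory map are all hypothetical. The map only stands in for a real database table, and ids are assumed to be safe file names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: image bytes live on the file system, while the file
// path and the float reading live in a "table". The map below is only a
// stand-in for a real database table.
class ImageStore {
    // One metadata row per image: where the file is, plus a sensor value.
    static final class ImageRow {
        final String path;
        final float reading;

        ImageRow(String path, float reading) {
            this.path = path;
            this.reading = reading;
        }
    }

    private final Path root;
    private final Map<String, ImageRow> table = new HashMap<>();

    ImageStore(Path root) {
        this.root = root;
    }

    // Write the bytes to disk, then record the path and reading as metadata.
    void save(String id, byte[] imageBytes, float reading) {
        try {
            Files.createDirectories(root);
            Path file = root.resolve(id + ".img");
            Files.write(file, imageBytes);
            table.put(id, new ImageRow(file.toString(), reading));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Metadata lookups never touch the image file itself.
    ImageRow lookup(String id) {
        return table.get(id);
    }

    // Resolve the stored path and read the image; null if the id is unknown.
    byte[] load(String id) {
        ImageRow row = table.get(id);
        if (row == null) {
            return null;
        }
        try {
            return Files.readAllBytes(Paths.get(row.path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Queries on the float values then run against the small metadata rows, and the image bytes are read only when a specific image is actually needed.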
It totally depends on the use case. You can choose:
HDFS: when you want to read the images as a whole, or transfer or process them to manipulate the image data and then store or act on the processed results. In short, when you want to run MapReduce operations. Reading images in HDFS is sequential, so fetching a particular image based on some selection criteria is costly and hurts performance.
Database: better for query-based operations, where you want to query or run DML operations on images based on certain criteria; in short, WHERE conditions. But it is very time-consuming when you want to process the images in bulk, and performance will obviously be slow, since you want to store 150 thousand images.
So my suggestion, based on the requirement: since you want to store the images as an intermediate step, it is better to store them in HDFS itself.
150,000 images is not considered a huge amount today. If an average of 10 MB is assumed for each image (uncompressed), the total is 1.5 TB, which should be possible to store in an off-the-shelf database like PostgreSQL (on off-the-shelf hardware, i.e. a Linux box with some RAID disks). I'm no expert in HDFS, but I have tried products in the same family and found them easy to use. You could try Hadoop for processing the images as well, if you are looking for a way to parallelize the processing. Even though this product family is nice, I would still use a standard database like PostgreSQL if parallelization (which HDFS gives you by nature) is not really needed.
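The estimate above is easy to re-run for other image counts or sizes. This is a small sketch assuming decimal units (1 TB = 1,000,000 MB); the class and method names are made up for illustration.

```java
// Back-of-the-envelope storage estimate in decimal units (1 TB = 1,000,000 MB).
class StorageEstimate {
    static double totalTerabytes(long imageCount, double megabytesPerImage) {
        return imageCount * megabytesPerImage / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 150,000 images at an assumed 10 MB each, as in the answer above.
        System.out.println(totalTerabytes(150_000, 10.0)); // prints 1.5
    }
}
```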
I am wondering if there is a way to cache arbitrary data from web requests onto the disk with Android. The flow I am thinking of is as follows:
The data is stored as a key-value pair, where the key is some identifier and the value is the raw data. Before actually making my web request, I check whether the key is in the cache; if so, I skip the web request. If the key does not exist in the cache, I make the web request and store the data on the disk. I would like the cached data to be accessible across multiple runs of the app, so that I don't have to make the web request again every time I start the app.
I was considering using SharedPreferences for this. Would SharedPreferences be the best way to go about this? Is it okay to store 1 megabyte of data in a single key in SharedPreferences?
The best solution for storing cache files is to put them in a cache directory. Luckily, the Android API provides a solution to this problem: Context#getCacheDir. You can create files in the directory it returns, and you can use a map to store an identifier for each file in order to retrieve them.
However, this solution has a few limitations:
The system will automatically delete files in this directory as disk space is needed elsewhere on the device.
Cache data should only be used for temporary storage of information.
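A minimal sketch of the check-then-fetch flow from the question, written in plain Java so it runs off-device. On Android the directory would presumably come from context.getCacheDir(). The class name DiskCache and the fetcher callback are hypothetical. Instead of keeping a separate map, this sketch derives each file name from a hash of the key, so lookups still work after the app restarts.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Supplier;

// Hypothetical key -> file cache. The directory is passed in; on Android it
// could be the one returned by Context#getCacheDir.
class DiskCache {
    private final File dir;

    DiskCache(File dir) {
        this.dir = dir;
    }

    // Hash the key so any identifier (for example a URL) maps to a safe file name.
    private File fileFor(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder name = new StringBuilder();
            for (byte b : digest) {
                name.append(String.format("%02x", b));
            }
            return new File(dir, name.toString());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Return the cached bytes if present; otherwise run the fetcher
    // (the web request) and store its result on disk.
    byte[] getOrFetch(String key, Supplier<byte[]> fetcher) {
        File file = fileFor(key);
        try {
            if (file.isFile()) {
                return Files.readAllBytes(file.toPath());
            }
            byte[] data = fetcher.get();
            dir.mkdirs();
            Files.write(file.toPath(), data);
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the system may delete files in the cache directory at any time, a missing file is treated the same as a cache miss and simply triggers a fresh fetch.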
I may be coming late, but a couple years ago I made a library just for this:
https://github.com/fcopardo/EasyRest
The idea is to allow the app to operate with an unstable connection, or none at all, without having to implement a secondary data layer for persisting data. Instead, it keeps the responses for as long as you want and refreshes them without forcing the user to wait. Take a look, you may get some ideas.
I need to store attachments on the server side. I can store them either in a BLOB column of a database or in a directory on the file system.
My question is which one is more reliable, scalable and maintainable?
EDIT:-
If we go for the file system, we have to handle synchronization ourselves, don't we? For example, if two users are trying to create or update a file in the same directory, how will we handle concurrency with the file system?
Storing the data in a directory is more reliable when it comes to indexing, fetching data, and other operations. Just store the path of the file in the DB and store the file itself in the directory.
When lots of storage requests hit the server, handling so many requests becomes hard and complex.
So it's better to store the data in a directory, which makes accessing it faster. This becomes more important as the daily volume stored in the DB grows. So when you start any system, study it well first and then decide which technique will be best.
When there is more data in the DB, clustering and indexing become more important.
If you only need to store small amounts of data, a BLOB is a good option, but for large data I would not recommend it. I built an online data store web application and faced this situation, and in the end I stored the data in a directory and just the path in the DB.
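On the concurrency concern raised in the question's edit, one common technique is to write each upload to a temporary file in the same directory and then move it into place in a single step. The sketch below is only an illustration under assumptions: AtomicFileWriter is a made-up name, and the approach gives "last writer wins" semantics rather than merging concurrent updates.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: readers see either the old file or the new one,
// never a half-written file.
class AtomicFileWriter {
    static void write(Path target, byte[] data) {
        try {
            Path dir = target.toAbsolutePath().getParent();
            Files.createDirectories(dir);
            // The temp file is in the same directory, so the move stays on one file system.
            Path tmp = Files.createTempFile(dir, "upload-", ".tmp");
            try {
                Files.write(tmp, data);
                // Per the Javadoc, whether an existing target is replaced by an
                // ATOMIC_MOVE is implementation specific; on POSIX systems it is.
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                // Clean up the temp file if the write or move failed.
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If two users must not overwrite each other's changes, this is not enough on its own; a version check on the row that stores the file path in the DB would be one way to detect conflicting updates.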
I am working on an application which has an offline mode. To support it, we store the information in a local SQLite database, using a Content Provider as a wrapper around SQLite, and sync it every once in a while with the data from the web service.
We also keep the images taken by the user on the SD card and send them to the server during the sync service.
The problem is bandwidth and data usage. In Android 4.0+, there is a section in the device settings named Data usage. It shows too much data usage for our app, which annoys the users.
My first question is: do you think using ProGuard, which is a tool to shrink code, can have any impact on reducing data usage?
I would appreciate it if you shared any experience or suggestions for reducing data usage in such an app.
Addenda:
1 - The user logs in to the system, and during the first sync the SQLite file is generated and transferred from the REST service (initialization).
2 - We have a sync-status flag for entries in the database. If a record (using a JSON string for data) or a picture is not synced, it is transferred to the REST service during sync and the status flag is updated.
3 - An updated database file is received from the REST service and merged with the current database on the phone in the sync service (if initialization is already done).
ProGuard has nothing to do with the amount of data you send/receive from a server. ProGuard can shrink and obfuscate code (thus making your APK smaller).
You need to analyze the data you send and receive. There is no silver bullet here that will magically solve any bandwidth issues you may come across in an app. You need to ask yourself several questions and take action depending on your answers:
What kind of numbers are we talking about?
In 2011 the average bandwidth use of an app was around 10MB per hour. There are probably more recent surveys if you search a bit. Are you far above the average number? If not, then I don't think you have to worry too much.
How often do you send and receive data?
If it's a real-time app that absolutely requires live data, then there's little you can do. If it's not a real-time app, maybe you can reduce the frequency of send/receive, or wait and collect more data before sending it to reduce overhead? If you're sending many small chunks of data, you'll get a lot of overhead in HTTP headers and so on. Hold on to the small chunks a while longer and send them in one go to improve the data-to-overhead ratio.
Can you change the protocol?
Maybe you can send data over a socket instead of HTTP to reduce overhead? By your description it doesn't sound like this would work in your case.
Can you compress data before sending it?
Make sure that your server GZips data before sending it to the client. There is a lot to gain by doing this.
Can you use another data format (binary, json, xml, custom)?
You mention that you use JSON. JSON is usually more compact than XML, so you're already good there, but maybe you can send the data in another format that is even more compact?
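The batching and compression suggestions above can be combined. Below is a minimal sketch using only java.util.zip. The class name BatchCompressor and the record format are hypothetical, and it assumes the server is set up to accept a gzip-compressed request body, which is not stated in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: collect unsynced JSON records and send them as one
// gzip-compressed payload instead of many small requests.
class BatchCompressor {
    // Join the pending records into one JSON array and gzip it.
    static byte[] gzipBatch(List<String> jsonRecords) {
        String body = "[" + String.join(",", jsonRecords) + "]";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(body.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Inverse operation, as the receiving side would perform it.
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Repetitive JSON tends to compress well, while photos in an already compressed format such as JPEG gain little from gzip. For the pictures, reducing resolution or quality before upload is more likely to help.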
I'm using Google App Engine and I need to store a big file (2-20 MB). It is a text file that I convert to a JSONArray. I need to be able to add JSONObjects to this array and to read it.
I wanted to use blobs, but I noticed that blobs can't be updated (is that true?).
I don't want to enable billing, so I can't use FileService (or can I?).
Storing each JSONObject in the DB blows through my read quota.
With the cache, the objects are sometimes removed.
Do you see a way to solve my problem?
Best regards!
This is what the blobstore is for.
https://developers.google.com/appengine/docs/java/blobstore/overview
The Blobstore API allows your application to serve data objects, called blobs, that are much larger than the size allowed for objects in the Datastore service. Blobs are useful for serving large files, such as video or image files, and for allowing users to upload large data files. Blobs are created by uploading a file through an HTTP request.
You get a free quota here as well.
No, you can't change them once you have uploaded them. If you want to do that then store your data as structured data in the datastore instead. But you can delete and replace blobs.