We are using a Change Data Capture tool to migrate source data to a target database in near real-time.
The challenge is to identify as accurately as possible the data migration latency that exists between
source and target. The latency reporting capabilities of the tool are not to our satisfaction and so I
have been tasked with developing a process that will better monitor this specific metric.
There are two main reasons why we need to know this:
1: Provide our users with an accurate data availability matrix to support report scheduling. For example, how much time should pass after midnight before scheduling a daily reconciliation report for the previous day, given that we want this information as soon as possible?
2: Identify situations when the data mirroring process is running slower than usual (or has even stopped).
This will trigger an email to our support team to investigate.
I am looking for some general ideas on how best to go about this seemingly simple task.
My preferred approach is a dedicated heartbeat or health-check table.
At the source, the table has an identity column (SQL Server) or a value from a sequence (Oracle) as the main identifier; a fixed task name string; a fixed server string (if not already identified by the task name); and the current time.
Have a script/job on the source to insert a record every minute (or 2 minutes, or 10 minutes); a sketch follows below.
In the CDC engine (if there is one), add a column with the time the change event was processed.
At the target, add a final column defaulting to the current time at insert.
A single target table can accommodate multiple sources/tasks.
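A minimal sketch of the source-side insert job, assuming a SQL Server source, JDBC, and a hypothetical dbo.HEARTBEAT table with the columns described above (all table, column and connection names are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HeartbeatWriter {

    // Assumed connection string and table layout: id IDENTITY, task_name, server_name, source_time.
    private static final String JDBC_URL = "jdbc:sqlserver://source-host;databaseName=cdc_monitor";
    private static final String INSERT_SQL =
            "INSERT INTO dbo.HEARTBEAT (task_name, server_name, source_time) VALUES (?, ?, SYSDATETIME())";

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // One heartbeat row per minute; the identity column supplies the main identifier.
        scheduler.scheduleAtFixedRate(() -> {
            try (Connection conn = DriverManager.getConnection(JDBC_URL, "monitor_user", "secret");
                 PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
                ps.setString(1, "orders_replication"); // fixed task name
                ps.setString(2, "SRC-DB-01");          // fixed server string
                ps.executeUpdate();
            } catch (Exception e) {
                e.printStackTrace(); // a real job would alert the support team on repeated failures
            }
        }, 0, 1, TimeUnit.MINUTES);
    }
}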
The regular blips will allow one to see at a glance whether changes are coming through, and whether the application is generating changes at all.
A straightforward report can show the current latency, as well as the latency over time (a rough query is sketched below).
It is nice to be able to compare 'this Monday' with 'last Monday' to see if things are similar, better or worse.
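A rough sketch of such a report, run from Java against the target and reusing the illustrative table/column names from the sketch above (id is the identity column, target_insert_time the column defaulted at insert):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LatencyReport {

    // Latency of the most recent heartbeat row per task, in seconds (SQL Server syntax assumed).
    private static final String LATEST_LATENCY_SQL =
            "SELECT h.task_name, DATEDIFF(SECOND, h.source_time, h.target_insert_time) AS latency_seconds " +
            "FROM dbo.HEARTBEAT h " +
            "WHERE h.id = (SELECT MAX(id) FROM dbo.HEARTBEAT WHERE task_name = h.task_name)";

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:sqlserver://target-host;databaseName=cdc_monitor", "monitor_user", "secret");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(LATEST_LATENCY_SQL)) {
            while (rs.next()) {
                // This is also the hook for the alert: email support if latency_seconds exceeds a threshold.
                System.out.println(rs.getString("task_name") + ": " + rs.getInt("latency_seconds") + "s behind");
            }
        }
    }
}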
Cheers, Hein.
Queries -
I am working on building a REST API to support search/filtering of transactions. Below are the two requirements the API is expected to support:
Retrieve a list of transactions by an array of [transactionid] - minimum is 1
Retrieve a list of transactions by ((transactionType=Sale OR transactionType=Refund) AND storeid=XXXX)
I am looking to design this as a POST request, treating search as a resource, something like the below. I am puzzled by the second requirement above, which involves complex querying with "AND"/"OR" operations. Any input on it will be deeply appreciated. The API is expected to support searching transactions on varied combinations of attributes.
Design for Requirement 1
POST /sales-transactions/search
{
  "searchrequestid": "xxxxxx",
  "transactionids": [1, 2, 3, 4, ...]
}
If "retrieve" is an essentially read-only operation, then you should be prioritizing a design that allows GET, rather than POST.
Think "search form on a web page"; you enter a bunch of information into input controls, and when you submit the form the browser creates a request like
GET /sales-transactions/search?searchrequestid=xxxxxx&transactionIds=1,2,3,4...
Query parameters can be thought of as substitutions; the machines don't care which parameters are being used for AND and which are being used for OR.
select * from transactions where A = :x and B = :y or C = :z
GET /sales-transactions/search?A=:x&B=:y&C=:z
Because the machines don't care, you have the freedom to choose spellings that make things easier for some of your people. So you could instead, for example, try something like
GET /sales-transactions/AandBorC?A=:x&B=:y&C=:z
It's more common to look to your domain experts' language for the name of the report, and use that:
GET /some-fancy-domain-name?A=:x&B=:y&C=:z
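For example, a rough Jersey/JAX-RS sketch of such a GET endpoint; the parameter names and the repeated-parameter convention are just one possible spelling, not the only correct one:

import java.util.Collections;
import java.util.List;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/sales-transactions/search")
public class TransactionSearchResource {

    // GET /sales-transactions/search?transactionId=1&transactionId=2
    // GET /sales-transactions/search?transactionType=Sale&transactionType=Refund&storeId=XXXX
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response search(@QueryParam("transactionId") List<Long> transactionIds,
                           @QueryParam("transactionType") List<String> transactionTypes,
                           @QueryParam("storeId") String storeId) {
        // Repeating a parameter expresses the OR between its values (Sale OR Refund);
        // combining different parameters expresses the AND with storeId.
        List<Object> results = Collections.emptyList(); // placeholder: hand the three values to your query layer
        return Response.ok(results).build();
    }
}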
When we start having to support arbitrary queries through the web interface, and those queries become complicated enough that we start running into constraints like URI lengths, the fallback position is to use POST with the query described in the message body.
And that's "fine"; you give up caching, and safety, and idempotent semantics; but it can happen that the business value of the ad hoc queries wins the tradeoff.
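If you do land in that fallback position, a minimal Jersey sketch could look like the following; the SearchRequest fields simply mirror the example in the question, and the separate class is only to keep the sketch short (in practice the method would sit on the same resource as the GET):

import java.util.List;
import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/sales-transactions/search")
public class TransactionSearchPostResource {

    // Body example: {"searchRequestId":"xxxxxx","transactionIds":[1,2,3],"transactionTypes":["Sale","Refund"],"storeId":"XXXX"}
    public static class SearchRequest {
        public String searchRequestId;
        public List<Long> transactionIds;
        public List<String> transactionTypes;
        public String storeId;
    }

    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    public Response search(SearchRequest request) {
        // The body has room for arbitrarily complex AND/OR combinations, at the cost of
        // GET's caching, safety and idempotence.
        return Response.ok().build(); // placeholder: translate the request into your query layer
    }
}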
I am working on a college project in Java. One of the first tasks we were given is to choose an architectural pattern for our project (MVC, Repository, layers, etc.) and create it visually.
I found a lot of different examples of architectural patterns on the internet, but I can't find anything that matches the idea of the project 100%.
I also couldn't find an architectural pattern example for a similar project (a flight search engine system).
I'd appreciate any help finding the right architectural pattern for the system we're creating in our project. Details about the system below:
Main functions: sign up, login, search, place an order, and export reports for the travel agent / the agency as a whole.
Only a travel agent (with a certificate) or a travel agent from a travel agency can sign up to the system and use it. It is not possible for the passenger to use the system.
The agent can run a search. The results of the searches are pulled from a static JSON file (it is not a complex system, so it is not pulling from a real-time database or anything; we just shuffle the file every 2 hrs or so).
The search has different filters, including destination, origin country, number of passengers, one-way or round trip, and other non-mandatory fields.
The results are listed from best to worst (pricing and shortest path). The algorithm to calculate the price is pretty simple and is based on airline company type (charter or scheduled flight), day of the week, season, holidays, etc.
If the customer (passenger) is interested in a flight, the travel agent can order it for him/her. An email with the order details will be sent to the customer. The seats available on that ordered flight will be reduced accordingly and changed in the specific airline company file we allocated for it.
In addition, an export option is available for the agent to view all of the orders they made, for all time as well as for specific dates. Cancellation is possible too.
That's it about the project,
I'd appreciate any help!
Thanks!
I would suggest changing the term "architectural pattern" to architectural style. Then, keep in mind that an architecture is a set of multiple architectural styles that are composed together into a system.
As I've said, you should choose multiple architectural styles, not a single one, when designing a system. From the description you posted, I would use an MVC approach for the web-facing parts: login, signup, place order, where you will use models, views and controllers. I assume you will read in detail about what a model, a view and a controller are.
Also, I would use a layered ports-and-adapters/onion-architecture style for better decoupling of the code. Use adapters for interaction with external systems such as the database. Think in terms of a domain model, using domain entities, aggregates and repositories.
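As a rough Java sketch of the ports-and-adapters idea applied to your flight search, with illustrative names only (FlightSearchPort, JsonFileFlightAdapter and the minimal domain types are assumptions, not a prescribed design):

import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Minimal domain types, kept deliberately small for the sketch.
class Flight {
    String origin;
    String destination;
    double price;
}

class SearchCriteria {
    String origin;
    String destination;
    int passengers;
}

// Port: controllers and domain services depend only on this interface.
interface FlightSearchPort {
    List<Flight> search(SearchCriteria criteria);
}

// Adapter: this implementation reads your static JSON file; it could later be swapped for a
// database- or web-API-backed adapter without touching the domain model or the MVC layer.
class JsonFileFlightAdapter implements FlightSearchPort {
    private final Path jsonFile;

    JsonFileFlightAdapter(Path jsonFile) {
        this.jsonFile = jsonFile;
    }

    @Override
    public List<Flight> search(SearchCriteria criteria) {
        List<Flight> results = new ArrayList<>();
        // Parse jsonFile (e.g. with Jackson) into Flight objects and keep only those matching
        // the criteria; the parsing is omitted to keep the sketch short.
        return results;
    }
}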
Good luck!
I'm new to Java development and Android. I need to implement functionality for a currency converter.
I need to know how to take information from the Internet (and periodically update it) and save it in a database.
For making a currency converter, you would need to fetch the conversion rates in real time. These are some ways to achieve this in your client:
1) Make AJAX calls to APIs (find some third-party service to fetch the conversion rates in real time, something like this: How do I get currency exchange rates via an API such as Google Finance?), or alternatively use something like Web scraping with Java, which needs you to write your own back end responsible for making the Google search and retrieving the exchange rates.
2) Whenever a conversion is executed by an end user, make a call to the API decided on in step 1 and use the updated exchange rate to calculate and provide the result. I would say don't go for persisting the values in the database, as the foreign exchange market is constantly "live", meaning that it never closes, even at night, so the exchange rate is always changing (https://www.purefx.co.uk/foreign-currency-exchange-insight/view/do-exchange-rates-change-daily).
3) You could save the results in a cache and update the cache periodically using a cron job (maybe use ExecutorService if you are writing a back end in Java; see the sketch after this list), but in this case the rates might not be the latest ones, and keeping the values satisfactorily "live" means very noisy API calls on the server side even when there are no users actively using the client. You might want to trade off accurate conversion against your resource usage and update the server-side cache in intervals of, say, 1 hour.
4) The same APIs could also be used to fetch the current rates, if the end user just wants to check the exchange rates.
5) If you want to save the values in a database at all, you could use a key-value type of database like Redis (https://redis.io/topics/client-side-caching).
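A minimal sketch of the server-side cache refresh from point 3, using a ScheduledExecutorService; the currency pair, the one-hour interval and the fetchRateFromApi helper are assumptions for illustration:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RateCache {

    // Latest known rates, keyed by currency pair, e.g. "USD/EUR".
    private final Map<String, Double> rates = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // Refresh every hour: stale-but-cheap versus live-but-noisy is the trade-off from point 3.
        scheduler.scheduleAtFixedRate(this::refresh, 0, 1, TimeUnit.HOURS);
    }

    private void refresh() {
        try {
            rates.put("USD/EUR", fetchRateFromApi("USD", "EUR"));
        } catch (Exception e) {
            e.printStackTrace(); // keep serving the last good rates if a refresh fails
        }
    }

    public Double getRate(String pair) {
        return rates.get(pair);
    }

    // Hypothetical helper: perform the HTTP call to whichever rates API you chose in point 1.
    private double fetchRateFromApi(String from, String to) {
        return 1.0; // placeholder value
    }
}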
Hope this helps.
We are trying to build the backend for a job portal, for which we are building Android and iPhone clients.
Here are the basic fields which need to be persisted/searchable.
User metadata and preferences need to be stored:
category to which the user belongs (single value)
skills of the user (multi-value)
user location in text and in latlng
Job data and its searchable fields:
job category (single value)
job skills (multi-value)
job location in text as well as in latlng
Some of the basic use cases:
When a job is about to be posted, we should be able to get a list of candidates near the location, based on job category/skills and latlng.
When a job is posted, it has to be matched against the actual candidates, and their meta information fetched and persisted in another table/schema.
When a new user onboards, get suitable jobs for the candidate and store them in another table.
This data will be served to the Android/iPhone clients and a web dashboard serving real-time data.
I need your suggestions for choosing the framework, considering factors of HA, scalability, reliability and cost.
You might want to use both MySQL and Solr, for different purposes. For persisting the data, it is better to use MySQL or a similar database, because they will provide you all the ACID properties. You should index your job and user data into Solr/Lucene, which can serve the real-time search on your platform and provide suggestions for an auto-completion feature. Solr also provides geo-location search, which could be used to match users and jobs. You can always build a recommendation feature on top of that. SolrCloud can be configured for HA and scalability.
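A rough SolrJ sketch of the geo-location part, to give an idea of how the "candidates near a job" lookup could work; the collection name, field names and radius are assumptions:

import java.io.IOException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class CandidateGeoSearch {

    public static void main(String[] args) throws SolrServerException, IOException {
        // Collection "candidates" with fields "skills", "category", "latlng" is assumed.
        HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr").build();

        SolrQuery query = new SolrQuery("skills:java AND category:engineering");
        query.addFilterQuery("{!geofilt}"); // restrict to a radius around the job location
        query.set("sfield", "latlng");      // the indexed location field
        query.set("pt", "12.9716,77.5946"); // the job's lat,lng
        query.set("d", "25");               // radius in km

        for (SolrDocument doc : solr.query("candidates", query).getResults()) {
            System.out.println(doc.getFieldValue("id") + " -> " + doc.getFieldValue("name"));
        }
        solr.close();
    }
}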
For searching, give Solr/Lucene a try. It is extremely scalable, battle-tested and mature.
They are different tools with different advantages and disadvantages. Both are used on a massive scale. For small projects the business answer is probably "Do what you know, because it will save you developer-hours."
Just be sure it doesn't lock you into a situation where it's hard to make a change you want down the road. See, e.g., http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
I created a simple REST service POST method that consumes XML. I created a REST client with Jersey, built my object, and I am trying to see the variations in response time by increasing my XML length, that is, by giving larger inputs to my objects. Say my object holds simple employee details; I will increase that. I see that the response time varies inconsistently; from my observation it is not dependent on the size of the XML. I am computing the time taken as follows.
long startTime = System.currentTimeMillis();
// enter code here for the POST
long elapsedTime = System.currentTimeMillis() - startTime;
Please suggest if there is a better way of doing it.
What I would like to get clarified is this: my server is on localhost, so why does the response time vary (say, once it is 88 ms and another time it is 504 ms)? What I expect is that the response time should increase when I give larger inputs to my XML object, but from what I observe that does not happen. Please clarify, or point me to a better site or book where I can read about this.
Note that your question is quite broad (and will likely be closed as such). My explanation is similarly broad and just meant to give you some background on why you might see the behavior that you are seeing.
It is unlikely that the way you measure time makes a big difference here, given that you are up to hundreds of milliseconds. It is more likely that the service that you are invoking sometimes takes longer to respond.
It may help to compare it to what you see when you type a search query into Google. Sometimes the response pops up "instantaneously", but sometimes it takes a few moments to load. Since you're using the same browser for every search, it can't be the browser causing the difference.
Instead it is likely something in the web service that is varying between the calls. In the Google search example, you might be routed to a different Google server, the server might be using a different storage to search, the query might be cached at some point, etc.
So there is no way to determine why the performance is different between invocations by simply looking at the client code. Instead, you should look at the server code and profile what happens there for different invocations.
Use System.nanoTime instead of currentTimeMillis; also be aware that Java optimises on the fly, and if you do not warm the server up first then you will be timing interpreted execution of the bytecode, which will at some point flip over to JIT-compiled native code. The timings will differ by input length (in a step-like function), but you have a certain amount of noise to overcome before you will see that: a lot from GC, a lot from thread scheduling and OS issues.
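A rough sketch of that measurement approach; callPostService() is a hypothetical stand-in for the Jersey POST you are already making:

public class ResponseTimeProbe {

    public static void main(String[] args) {
        // Warm-up: let the JIT compile the hot paths on client and server before measuring.
        for (int i = 0; i < 1_000; i++) {
            callPostService();
        }

        // Measure many calls with System.nanoTime and average them, so single-call noise
        // (GC pauses, thread scheduling, OS jitter) matters less.
        int runs = 100;
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            callPostService();
            totalNanos += System.nanoTime() - start;
        }
        System.out.printf("average: %.2f ms%n", totalNanos / (double) runs / 1_000_000.0);
    }

    // Hypothetical placeholder: issue the Jersey POST with the XML payload here.
    private static void callPostService() {
    }
}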
I recommend the Mechanical Sympathy group for discussions on this sort of thing. There are many discussions there on how to measure these types of systems: https://groups.google.com/forum/#!forum/mechanical-sympathy