Spring Data Elasticsearch vs Java High Level REST Client - java

I'm new to Elasticsearch. We are building a Spring Boot application with Elasticsearch.
To integrate it, we can use either elasticsearch-rest-high-level-client or spring-boot-starter-data-elasticsearch.
Can anyone please elaborate on which option would be better overall, and why?

spring-boot-starter-data-elasticsearch can internally use either the transport client (deprecated, to be removed in ES 8.x) or the rest-high-level-client. Please refer to the Elasticsearch client section of the documentation for more information and for how to configure them.
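For reference, configuring the REST high-level client for Spring Data Elasticsearch looks roughly like this sketch (the host and port are assumptions, and the class names follow the Spring Data Elasticsearch 4.x API):

    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.elasticsearch.client.ClientConfiguration;
    import org.springframework.data.elasticsearch.client.RestClients;
    import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;

    @Configuration
    public class ElasticsearchClientConfig extends AbstractElasticsearchConfiguration {
        @Override
        @Bean
        public RestHighLevelClient elasticsearchClient() {
            // Cluster address is an assumption; adjust to your environment
            ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                    .connectedTo("localhost:9200")
                    .build();
            return RestClients.create(clientConfiguration).rest();
        }
    }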
And from the same link:
Spring Data Elasticsearch operates upon an Elasticsearch client that is connected to a single Elasticsearch node or a cluster. Although the Elasticsearch Client can be used to work with the cluster, applications using Spring Data Elasticsearch normally use the higher level abstractions of Elasticsearch Operations and Elasticsearch Repositories.
Bottom line: you can use the rest-high-level client directly in your Spring Boot application, but if you want more abstraction you can use the spring-boot-starter-data-elasticsearch dependency and its higher-level methods, although internally it would still use the client offered by Elasticsearch.
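To make the trade-off concrete, here is a rough sketch of both levels; the index name, entity class, and field are hypothetical:

    import java.util.List;
    import org.apache.http.HttpHost;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

    public class ClientVsRepository {
        // Option 1: the raw client -- you build requests and parse hits yourself
        static SearchResponse searchDirectly() throws Exception {
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
                SearchRequest request = new SearchRequest("articles")
                        .source(new SearchSourceBuilder()
                                .query(QueryBuilders.matchQuery("title", "spring")));
                return client.search(request, RequestOptions.DEFAULT);
            }
        }

        // Option 2: Spring Data -- a derived query; entity mapping is handled for you
        interface ArticleRepository extends ElasticsearchRepository<Article, String> {
            List<Article> findByTitle(String title);
        }

        static class Article { /* would be an @Document-annotated entity in a real app */ }
    }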

Related

How to connect Kafka to legacy Spring

I am working on a PoC of connecting a legacy Spring application to Kafka. It is a WAR application deployed in Tomcat, Spring version 4.3.12. Is there some library that makes communication with Kafka almost as easy as with Spring Boot? I need just the fundamental operations: sending a message, listening for confirmation, receiving.
I have some experience with the Spring Boot support as provided in the org.springframework.kafka:spring-kafka library. I am not sure how to efficiently adopt Kafka for legacy Spring. I'm thinking of using the Kafka Java client, which looks promising, but as I am used to working at the Spring Boot abstraction level, I have no clue how much code I would have to supply myself.
Web search is not much help in this case, since it tends to surface Spring Boot-related solutions. Migrating the legacy application is under consideration too; I just need some idea of how difficult each path is.
kafka-clients is all you need (from Maven Central, not Confluent). You could go a step further and look into the Log4j2 Kafka bridge, configured via property files.
If you want to externalize the config into a regular Java .properties file, you can; or you can pull values from environment variables if you follow 12-factor principles.
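A minimal sketch of that approach, assuming a hypothetical kafka.properties file that contains bootstrap.servers and the serializer settings:

    import java.io.FileInputStream;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class LegacyKafkaSender {
        public static void main(String[] args) throws Exception {
            // Externalized config, e.g. bootstrap.servers=localhost:9092 and
            // key.serializer / value.serializer entries (file name is hypothetical)
            Properties props = new Properties();
            try (FileInputStream in = new FileInputStream("kafka.properties")) {
                props.load(in);
            }
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // send() returns a Future; get() blocks until the broker confirms
                RecordMetadata meta = producer
                        .send(new ProducerRecord<>("orders", "key", "payload"))
                        .get();
                System.out.println("Acked at offset " + meta.offset());
            }
        }
    }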
But if you don't already have Spring Boot dependencies, then I do not think adding them is worth it for only Kafka.
Also, the Spring-Kafka documentation covers how to configure your app without Boot.

Spring Batch without Spring Cloud Data Flow

I have a Spring Boot application that uses Spring Batch. I now want to implement an admin panel to see all job statuses. For this, Spring has "spring-batch-admin", but I see that it was deprecated a long time ago:
The functionality of Spring Batch Admin has been mostly duplicated and expanded upon via Spring Cloud Data Flow and we encourage all users to migrate to that going forward.
But then Spring Cloud Data Flow says:
Pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks
So in order to use this functionality, do I really need to convert my Spring Boot app to a microservice? Isn't this overkill just to see some batch statuses? Also, I cannot install Docker on my production server (for various reasons). Can I still use Spring Cloud Data Flow without Docker?
Yes, your Spring Boot batch app should be wrapped as a Spring Cloud Task, which should not be too complicated.
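A minimal sketch of that wrapping, assuming the spring-cloud-starter-task dependency is on the classpath (class and job setup are illustrative; your existing batch configuration stays as it is):

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.task.configuration.EnableTask;

    @EnableTask              // records each run as a Spring Cloud Task execution
    @EnableBatchProcessing   // existing Spring Batch setup stays untouched
    @SpringBootApplication
    public class BatchTaskApplication {
        public static void main(String[] args) {
            SpringApplication.run(BatchTaskApplication.class, args);
        }
    }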
If Docker does not suit your needs, Spring Cloud Data Flow can also be deployed locally: https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-local-deploying-spring-cloud-dataflow

Hadoop integration with e-commerce portal

We are building a new e-commerce portal from scratch using Java REST services, and we are planning to use MySQL (for now; Oracle in the future). We are also using Elasticsearch. We are building the whole portal as microservices. My question is: do I need to take care of analytics (like Hadoop and HDFS integration) from the beginning?
A single relational database works fine, but it scales poorly, especially for large-scale web services.
You need to measure your data ingestion volume/size to determine whether you need Hadoop (more specifically HDFS) for batch analytics on top of Elasticsearch. But likely not. You can use a standalone Apache Spark cluster to run things against Elasticsearch directly.
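For instance, a rough sketch of reading an index straight into Spark via the elasticsearch-hadoop connector; the index name, master URL, and node address are assumptions:

    import java.util.Map;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

    public class EsSparkJob {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("es-analytics")
                    .setMaster("spark://spark-master:7077") // standalone cluster (hypothetical host)
                    .set("es.nodes", "localhost:9200");     // Elasticsearch address (assumption)
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Each element is (document id, document fields as a Map)
                JavaPairRDD<String, Map<String, Object>> docs =
                        JavaEsSpark.esRDD(sc, "orders/order");
                System.out.println("Documents: " + docs.count());
            }
        }
    }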
However, you could also use Kafka as a message bus between your JDBC-compatible database and an Elasticsearch index. And Spark Streaming works well with Kafka.
And if you want to add Hadoop to the mix, you can pull the same data from Kafka to fill an HDFS directory.
There are many blog posts about microservice communication via Kafka.

Spring Boot monitoring in practice

Spring Boot Actuator exposes the /metrics endpoint, but it provides value only when combined with monitoring tools, dashboards, alerting, etc. So:
Does Spring Boot provide support for push-based metrics collection? If so, what is the tool?
Or are there some production-ready tools (with a service registry, etc.) that work with Spring Boot in a pull-based manner and actually use the /metrics endpoint? For example, Prometheus discovers all EC2 instances perfectly, but is incompatible with Spring Boot metrics (counters and format).
So are there any real-world, production-ready tools that can be used out of the box? Or are we not there yet?

How does the Embedded Neo4j actually work?

I am new to Neo4j, and based on the reading I have done so far, it seems there are two ways to interact with it: Neo4j REST and embedded. Where I am a little confused is: does the embedded option only give you the ability to use the native Neo4j API to manipulate the datastore, or can you also embed Neo4j and package it with your Java application? If so, how would I go about doing that?
As far as I know, the term "embedded" was coined for integrating Neo4j into your application. In embedded mode, your DB is locked and your application is solely authorized to access it. You cannot access the DB from anywhere else while your application is running and accessing it.
Whereas Neo4j REST, or say Neo4j Server, supports a REST API through which you can perform all datastore-related operations via API calls. In REST API mode, you can manage your DB externally, using the Neo4j GUI console alongside your application.
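For instance, a minimal sketch of calling the server's transactional Cypher endpoint over HTTP; the local URL and query are assumptions, and the /db/data path applies to Neo4j 3.x:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class Neo4jRestExample {
        public static void main(String[] args) throws Exception {
            // Transactional Cypher endpoint of a locally running Neo4j server
            String body = "{\"statements\":[{\"statement\":\"MATCH (n) RETURN count(n)\"}]}";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:7474/db/data/transaction/commit"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // JSON result rows from the server
        }
    }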
Performance-wise, I found embedded mode to be much faster than server mode.
does the embedded option only give you the ability to use the native Neo4j API to manipulate the datastore
You can use either mode (server REST API mode or embedded mode) to manipulate the datastore.
Package with Java Application
It depends on your application configuration. In embedded mode you generally don't need an external Neo4j server running; you just need to explicitly specify your DB path along with other configuration (I have used Spring Data Neo4j). Whereas in Neo4j Server mode, you will need a Neo4j server running.
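A minimal sketch of the embedded setup with the plain Java API (the DB path is an assumption; GraphDatabaseFactory is the Neo4j 3.x entry point):

    import java.io.File;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;

    public class EmbeddedNeo4jExample {
        public static void main(String[] args) {
            // The store lives in a local directory; no external server needed
            GraphDatabaseService db = new GraphDatabaseFactory()
                    .newEmbeddedDatabase(new File("data/graph.db"));
            try (Transaction tx = db.beginTx()) {
                db.createNode();   // native API call against the embedded store
                tx.success();      // mark the transaction for commit
            } finally {
                db.shutdown();     // release the store lock
            }
        }
    }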
You can have a look at this thread as well.
