Obtaining census block groups from shapefile based on latlong inputs - Java

I am new to shapefile processing and would appreciate guidance on the following.
I am using the shapefile tl_2018_us_aiannh.shp from census.gov (TIGER/Line). I want to obtain census entities such as Block, Tract, County Subdivision and County from the shapefile, based on the latitude and longitude provided by the user.
My requirement is to achieve this with the shapefile alone and not through any APIs.
Which framework can I use to achieve this?
What I've tried so far:
I have used GeoTools to read the shapefile. Can I continue using it? Can my requirement be met with this tool?
I have gone through documentation from census.gov which states:
The Census Bureau assigns a code and these appear in fields such as
“TRACTCE”, where “CE” stands for Census. Finally, state-submitted
codes end in “ST”, such as “SLDLST”, and local education agency codes
end in “LEA”, as in “ELSDLEA”.
I tried to read such a field in my code like this:
File file = new File("D:\\tl_2018_us_aiannh.shp");
try {
    Map<String, String> connect = new HashMap<>();
    connect.put("url", file.toURI().toString());
    DataStore dataStore = DataStoreFinder.getDataStore(connect);
    String[] typeNames = dataStore.getTypeNames();
    String typeName = typeNames[0];
    System.out.println("Reading content " + typeName);
    SimpleFeatureSource featureSource = dataStore
            .getFeatureSource(typeName);
    SimpleFeatureCollection collection = featureSource.getFeatures();
    SimpleFeatureIterator iterator = collection.features();
    try {
        while (iterator.hasNext()) {
            SimpleFeature feature = iterator.next();
            GeometryAttribute sourceGeometry = feature
                    .getDefaultGeometryProperty();
            String name = (String) (feature).getAttribute("TRACTCE");
            Property property = feature.getProperty("TRACTCE");
            System.out.println(property);
        }
    } finally {
        iterator.close();
    }
} catch (Throwable e) {
    // Print the failure instead of silently discarding it
    e.printStackTrace();
}
But I am receiving null as the value.
Any help would be appreciated.

I have found the solution to this. I hope it will be helpful to someone else.
Each SimpleFeature carries the attributes of one shapefile record; you can inspect them in a debugger or by printing the feature at runtime. Use the SimpleFeature to read a property by name:
while (iterator.hasNext()) {
    SimpleFeature feature = iterator.next();
    Property tract = feature.getProperty("TRACTCE");
    System.out.println(tract);
}
Make sure you choose Block Groups as the layer type when you download the shapefile from TIGER/Line (or whichever site you use). A feature returns null for an attribute name that its layer's schema does not contain, which is what happens when the downloaded layer has no TRACTCE field.
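Reading an attribute covers only part of the original requirement; the other part is deciding which polygon contains the user's latitude and longitude. GeoTools supports spatial filters for this, but the underlying test can be sketched in plain Java. The following is a minimal illustration of the even-odd (ray casting) rule, assuming a single polygon ring without holes given as arrays of longitudes and latitudes. The class and method names are hypothetical and are not part of GeoTools, and the rectangle is made-up data.

```java
/** Hypothetical helper: even-odd (ray casting) point-in-polygon test. */
class PointInPolygon {

    /**
     * Returns true if (lon, lat) lies inside the ring described by the
     * vertex arrays. Holes and multi-part polygons are not handled.
     */
    static boolean contains(double[] ringLon, double[] ringLat, double lon, double lat) {
        boolean inside = false;
        int n = ringLon.length;
        // Walk every edge (j -> i) and toggle on each crossing of a ray cast east from the point.
        for (int i = 0, j = n - 1; i < n; j = i++) {
            boolean straddles = (ringLat[i] > lat) != (ringLat[j] > lat);
            if (straddles) {
                // Longitude at which this edge crosses the point's latitude.
                double crossLon = ringLon[j]
                        + (lat - ringLat[j]) * (ringLon[i] - ringLon[j]) / (ringLat[i] - ringLat[j]);
                if (lon < crossLon) {
                    inside = !inside;
                }
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // A made-up rectangular area, for illustration only.
        double[] lon = {-105.0, -104.0, -104.0, -105.0};
        double[] lat = {39.0, 39.0, 40.0, 40.0};
        System.out.println(contains(lon, lat, -104.5, 39.5)); // point inside the rectangle
        System.out.println(contains(lon, lat, -103.5, 39.5)); // point east of the rectangle
    }
}
```

In practice this test (or the library's own spatial filter) would be applied to each feature's geometry, and TRACTCE and the related attributes would be read from the feature that matches.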

Related

how to parse pdf file in java and save some data in mysql database

I want to parse a PDF file in Java and extract some transactional data from it. I have used iText to read the PDF. It returns the whole PDF as a string, and I am not able to extract the data from it. What is a better approach to handle this?
Below is the content I get after parsing my PDF file. It is in string format, and I need to filter out the transactional data so that I can insert it into a database.
Date Transaction Amount Units Price Unit
(INR) (INR) Balance
Birla Sun Li[e Mutual Fund
Folio No: 1016409683 PAN: AZMPB2802L KYC: OK PAN: OK
B291GZ-Birla Sun Life India GenNext Fund - Growth-Direct Plan(Advisor: DIRECT) Registrar : CAMS
Opening Unit Balance: 0.000
12-Mar-2014 Purchase 5,000.00 146.113 34.22 146.113
22-Apr-2014 Purchase - via Internet 1,500.00 41.993 35.72 188.106
05-May-2014 Purchase - via Internet 1,500.00 42.505 35.29 230.611
13-Jan-2015 Purchase - via Internet 1,500.00 28.604 52.44 259.215
3-Feb-2015 Purchase - via Internet 3,000.00 54.835 54.71 314.050
03-Mar-2015 Purchase - via Internet 3,000.00 53.476 56.10 367.5260
Valuation on 10-Mar-2016: INR 58,956.90
Closing Unit Balance: 1,143.462 NAV on 10-Mar-2016: INR 51.56
Depending on the specific situation you're in, you can try various approaches.
iText has a tool called pdf2Data, which sounds like it does exactly what you're looking for. It processes a document according to a template and gives you an XML document. This is of course more suited to a commercial setting.
You can write your own extraction strategy that handles the PDF document in a more clever way. Suppose for instance that you want to extract information from a table in a PDF document.
You would implement IEventListener and listen for two kinds of events: line-drawing events (so that you get notified when the table is being drawn) and text-rendering events (to get the content in the table).
You then have to write several smart heuristics that define what constitutes a table. For a simple proof of concept you could simply look for lines that cross at 90-degree angles. Determine the bounding box. Then go looking for all text-rendering instructions within that box. Use another heuristic to determine the column and row boundaries.
What you actually need is a String, and the procedure you're looking for is called parsing. Once you have the String containing the whole PDF, you need to write some "smart" code (smart in the sense that it depends on the contents of your PDF) that splits the main String into smaller parts that are more useful to you.
After that, you need to set up a database connection in your Java app and write the database code that uses the smaller String parts produced by your parser to fill your tables.
Below is some code I wrote for an assignment that required parsing the useful parts of a larger String from a stream (in this case a .txt file containing stats from a basketball game).
public static ArrayList<HashMap<String, String>> parse(InputStream input) throws IOException {
    output = new ArrayList();
    count = 0;
    atcount = 0;
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(input))) {
        if (input != null) {
            while ((line = reader.readLine()) != null) {
                if (line.contains("Team")) {
                    playerBounds = false;
                    team2Bounds = true;
                    count = 0;
                }
                if (playerBounds == true && team2Bounds == false) {
                    count++;
                    listΑ = line.split("\t");
                    addToList();
                    //printAll();
                }
                if ((playerBounds == true) && (team2Bounds == true)) {
                    count++;
                    listΑ = line.split("\t");
                    addToList();
                    //printAll();
                }
                if (line.contains("Player")) {
                    playerBounds = true;
                }
                atcount = 0;
            }
        }
    } catch (IOException ex) {
        throw ex;
    } finally {
        try {
            input.close();
        } catch (Throwable ignore) {
        }
    }
    return output;
}
I hope this can help you :)
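For the statement text shown in the question, the parsing step described above could be sketched with a regular expression. This is a minimal sketch, assuming every transaction row starts with a date like 12-Mar-2014 and ends with four numeric columns (amount, units, price, unit balance), as in the sample; the StatementParser and Transaction names are hypothetical, and other statement layouts would need a different pattern.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: pull transaction rows out of the extracted PDF text. */
class StatementParser {

    /** One parsed row; field names follow the column headers in the sample. */
    static final class Transaction {
        final String date;
        final String description;
        final BigDecimal amount;
        final BigDecimal units;
        final BigDecimal price;
        final BigDecimal balance;

        Transaction(String date, String description, BigDecimal amount,
                    BigDecimal units, BigDecimal price, BigDecimal balance) {
            this.date = date;
            this.description = description;
            this.amount = amount;
            this.units = units;
            this.price = price;
            this.balance = balance;
        }
    }

    // date, free-text description, then four numbers that may contain thousands separators
    private static final Pattern ROW = Pattern.compile(
            "^(\\d{1,2}-[A-Za-z]{3}-\\d{4})\\s+(.+?)\\s+([\\d,]+\\.\\d+)\\s+([\\d,]+\\.\\d+)"
            + "\\s+([\\d,]+\\.\\d+)\\s+([\\d,]+\\.\\d+)$");

    /** Returns one Transaction per line that matches the row pattern; other lines are skipped. */
    static List<Transaction> parse(String text) {
        List<Transaction> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            Matcher m = ROW.matcher(line.trim());
            if (m.matches()) {
                rows.add(new Transaction(m.group(1), m.group(2), num(m.group(3)),
                        num(m.group(4)), num(m.group(5)), num(m.group(6))));
            }
        }
        return rows;
    }

    /** Strips thousands separators before converting, e.g. "5,000.00". */
    private static BigDecimal num(String s) {
        return new BigDecimal(s.replace(",", ""));
    }

    public static void main(String[] args) {
        String text = "Opening Unit Balance: 0.000\n"
                + "12-Mar-2014 Purchase 5,000.00 146.113 34.22 146.113\n"
                + "22-Apr-2014 Purchase - via Internet 1,500.00 41.993 35.72 188.106\n";
        for (Transaction t : parse(text)) {
            System.out.println(t.date + " | " + t.description + " | " + t.amount);
        }
    }
}
```

Each Transaction can then be bound to the parameters of a JDBC PreparedStatement to insert it into MySQL.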

alfresco buildonly indexer for searching the properties created on the fly

I am using Alfresco 5.1, the latest version.
One of my requirements is to create properties (key/value) where the user enters both the key and the value.
I have done that like this:
Map<QName, Serializable> props = new HashMap<QName, Serializable>();
props.put(QName.createQName("customProp1"), "prop1");
props.put(QName.createQName("customProp2"), "prop2");
ChildAssociationRef associationRef = nodeService.createNode(nodeService.getRootNode(storeRef), ContentModel.ASSOC_CHILDREN, QName.createQName(GUID.generate()), ContentModel.TYPE_CMOBJECT, props);
Now I want to search for nodes by these newly created properties. I was able to search on a newly created property like this:
public List<NodeRef> findNodes() throws Exception {
    authenticate("admin", "admin");
    StoreRef storeRef = new StoreRef(StoreRef.PROTOCOL_WORKSPACE, "SpacesStore");
    List<NodeRef> nodeList = null;
    Map<QName, Serializable> props = new HashMap<QName, Serializable>();
    props.put(QName.createQName("customProp1"), "prop1");
    props.put(QName.createQName("customProp2"), "prop2");
    ChildAssociationRef associationRef = nodeService.createNode(nodeService.getRootNode(storeRef), ContentModel.ASSOC_CHILDREN, QName.createQName(GUID.generate()), ContentModel.TYPE_CMOBJECT, props);
    NodeRef nodeRef = associationRef.getChildRef();
    String query = "@cm\\:customProp1:\"prop1\"";
    SearchParameters sp = new SearchParameters();
    sp.addStore(storeRef);
    sp.setLanguage(SearchService.LANGUAGE_LUCENE);
    sp.setQuery(query);
    try {
        ResultSet results = serviceRegistry.getSearchService().query(sp);
        nodeList = new ArrayList<NodeRef>();
        for (ResultSetRow row : results) {
            nodeList.add(row.getNodeRef());
            System.out.println(row.getNodeRef());
        }
        System.out.println(nodeList.size());
    } catch (Exception e) {
        e.printStackTrace();
    }
    return nodeList;
}
The alfresco-global.properties indexer configuration is
index.subsystem.name=buildonly
index.recovery.mode=AUTO
dir.keystore=${dir.root}/keystore
Now my questions are:
Is it possible to achieve the same using the solr4 indexer?
Or is there any way to use the buildonly indexer for a particular query?
In your query
String query = "@cm\\:customProp1:\"prop1\"";
remove cm. You are building the QName on the fly, so the property does not belong to the cm (ContentModel) namespace. Your query then becomes
String query = "@\\:customProp1:\"prop1\"";
I hope this works for you.
First, double check if you're simply experiencing eventual consistency, as described below. If you are, and if this presents a problem for you, consider switching to CMIS queries while staying on SOLR.
http://docs.alfresco.com/5.1/concepts/solr-event-consistency.html
Other than this, check if the node has been indexed at all. If it has, take a closer look at how you build your query.
How to find List of unindexed file in alfresco

Using ELKI with Mongodb

Using test cases I was able to see how ELKI can be used directly from Java but now I want to read my data from MongoDB and then use ELKI to cluster geographic (long, lat) data.
I can only cluster data from a CSV file using ELKI. Is it possible to connect de.lmu.ifi.dbs.elki.database.Database with MongoDB? I can see from the java debugger that there is a databaseconnection field in de.lmu.ifi.dbs.elki.database.Database.
I query MongoDB creating POJO for each row and now I want to cluster these objects using ELKI.
It is possible to read data from MongoDB and write it in a CSV file then use ELKI to read that CSV file but I would like to know if there is a simpler solution.
---------FINDINGS_1:
From the question "ELKI - Use List<String> of objects to populate the Database" I found that I need to implement de.lmu.ifi.dbs.elki.datasource.DatabaseConnection and specifically override the loadData() method, which returns an instance of MultipleObjectsBundle.
So I think I should wrap a list of POJOs in a MultipleObjectsBundle. Looking at MultipleObjectsBundle, it seems the data should be held in columns. Why is the columns datatype List<List<?>>? Shouldn't it be List<?>, just a list of the items you want to cluster?
I'm a little confused. How is ELKI going to know that it should look at the long and lat of each POJO? Where do I tell ELKI to do this? Using de.lmu.ifi.dbs.elki.data.type.SimpleTypeInformation?
---------FINDINGS_2:
I have tried to use ArrayAdapterDatabaseConnection and I have tried implementing DatabaseConnection. Sorry, I need things explained in very simple terms to understand them.
This is my code for clustering:
int minPts=3;
double eps=0.08;
double[][] data1 = {{-0.197574246, 51.49960695}, {-0.084605692, 51.52128377}, {-0.120973687, 51.53005939}, {-0.156876, 51.49313},
{-0.144228881, 51.51811784}, {-0.1680743, 51.53430039}, {-0.170134484,51.52834133}, { -0.096440751, 51.5073853},
{-0.092754157, 51.50597426}, {-0.122502346, 51.52395143}, {-0.136039674, 51.51991453}, {-0.123616824, 51.52994371},
{-0.127854211, 51.51772703}, {-0.125979294, 51.52635795}, {-0.109006325, 51.5216612}, {-0.12221963, 51.51477076}, {-0.131161087, 51.52505093} };
// ArrayAdapterDatabaseConnection dbcon = new ArrayAdapterDatabaseConnection(data1);
DatabaseConnection dbcon = new MyDBConnection();
ListParameterization params = new ListParameterization();
params.addParameter(de.lmu.ifi.dbs.elki.algorithm.clustering.DBSCAN.Parameterizer.MINPTS_ID, minPts);
params.addParameter(de.lmu.ifi.dbs.elki.algorithm.clustering.DBSCAN.Parameterizer.EPSILON_ID, eps);
params.addParameter(DBSCAN.DISTANCE_FUNCTION_ID, EuclideanDistanceFunction.class);
params.addParameter(AbstractDatabase.Parameterizer.DATABASE_CONNECTION_ID, dbcon);
params.addParameter(AbstractDatabase.Parameterizer.INDEX_ID,
RStarTreeFactory.class);
params.addParameter(RStarTreeFactory.Parameterizer.BULK_SPLIT_ID,
SortTileRecursiveBulkSplit.class);
params.addParameter(AbstractPageFileFactory.Parameterizer.PAGE_SIZE_ID, 1000);
Database db = ClassGenericsUtil.parameterizeOrAbort(StaticArrayDatabase.class, params);
db.initialize();
GeneralizedDBSCAN dbscan = ClassGenericsUtil.parameterizeOrAbort(GeneralizedDBSCAN.class, params);
Relation<DoubleVector> rel = db.getRelation(TypeUtil.DOUBLE_VECTOR_FIELD);
Relation<ExternalID> relID = db.getRelation(TypeUtil.EXTERNALID);
DBIDRange ids = (DBIDRange) rel.getDBIDs();
Clustering<Model> result = dbscan.run(db);
int i =0;
for(Cluster<Model> clu : result.getAllClusters()) {
System.out.println("#" + i + ": " + clu.getNameAutomatic());
System.out.println("Size: " + clu.size());
System.out.print("Objects: ");
for(DBIDIter it = clu.getIDs().iter(); it.valid(); it.advance()) {
DoubleVector v = rel.get(it);
ExternalID exID = relID.get(it);
System.out.print("DoubleVec: ["+v+"]");
System.out.print("ExID: ["+exID+"]");
final int offset = ids.getOffset(it);
System.out.print(" " + offset);
}
System.out.println();
++i;
}
The ArrayAdapterDatabaseConnection produces two clusters; I just had to play around with the value of epsilon. When I set epsilon=0.008, DBSCAN started creating clusters. When I set epsilon=0.04, all the items were in one cluster.
I have also tried to implement DatabaseConnection:
@Override
public MultipleObjectsBundle loadData() {
    MultipleObjectsBundle bundle = new MultipleObjectsBundle();
    List<Station> stations = getStations();
    List<DoubleVector> vecs = new ArrayList<DoubleVector>();
    List<ExternalID> ids = new ArrayList<ExternalID>();
    for (Station s : stations) {
        String strID = Integer.toString(s.getId());
        ExternalID i = new ExternalID(strID);
        ids.add(i);
        double[] st = {s.getLongitude(), s.getLatitude()};
        DoubleVector dv = new DoubleVector(st);
        vecs.add(dv);
    }
    SimpleTypeInformation<DoubleVector> type = new VectorFieldTypeInformation<>(DoubleVector.FACTORY, 2, 2, DoubleVector.FACTORY.getDefaultSerializer());
    bundle.appendColumn(type, vecs);
    bundle.appendColumn(TypeUtil.EXTERNALID, ids);
    return bundle;
}
Each long/lat pair is associated with an ID, and I need to link the clustered values back to that ID. Is using the ID offset (as in the code above) the only way to do that? I have tried adding an ExternalID column, but I don't know how to retrieve the ExternalID for a particular NumberVector.
Also, after seeing "Using ELKI's Distance Function", I tried to use ELKI's longLatDistance, but it doesn't work and I could not find any examples showing how to use it.
The interface for data sources is called DatabaseConnection.
JavaDoc of DatabaseConnection
You can implement a MongoDB-based DatabaseConnection to get the data.
It is not a complicated interface; it has a single method.
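Regarding the long/lat distance mentioned in the question: with a Euclidean distance on raw (longitude, latitude) pairs, epsilon is expressed in degrees, which is why small changes to it have such a large effect on the clusters. A great-circle distance gives epsilon a physical unit instead. The sketch below is a plain-Java haversine formula, assuming a spherical Earth with a radius of 6371 km. It only illustrates what a long/lat distance computes; it is not ELKI's own implementation, and the class name is hypothetical.

```java
/** Hypothetical sketch: great-circle distance between two long/lat points. */
class Haversine {

    // Assumption: spherical Earth with mean radius 6371 km.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Arguments are in degrees, in the same (longitude, latitude) order as the question's data. */
    static double distanceKm(double lon1, double lat1, double lon2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinPhi = Math.sin(dPhi / 2);
        double sinLambda = Math.sin(dLambda / 2);
        double a = sinPhi * sinPhi + Math.cos(phi1) * Math.cos(phi2) * sinLambda * sinLambda;
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // First two points of data1 from the question.
        double km = distanceKm(-0.197574246, 51.49960695, -0.084605692, 51.52128377);
        System.out.println("distance in km: " + km);
    }
}
```

With a distance like this, epsilon can be chosen as a radius in kilometres rather than as a fraction of a degree.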

Calling Microsoft Dynamics CRM 2011 online from JAVA

I'm doing a Dynamics CRM integration from a Java application and I've followed the example from the CRM training kit and managed successfully to connect and create accounts and contacts.
Now I'm having some problems with adding some more fields in the account creation and when connecting a contact with an account.
For instance I cannot create accounts with "address1_freighttermscode" that is a picklist.
My code is the following:
private static OrganizationServiceStub.Guid createAccount(OrganizationServiceStub serviceStub, String[] args) {
try {
OrganizationServiceStub.Create entry = new OrganizationServiceStub.Create();
OrganizationServiceStub.Entity newEntryInfo = new OrganizationServiceStub.Entity();
OrganizationServiceStub.AttributeCollection collection = new OrganizationServiceStub.AttributeCollection();
if (! (args[0].equals("null") )) {
OrganizationServiceStub.KeyValuePairOfstringanyType values = new OrganizationServiceStub.KeyValuePairOfstringanyType();
values.setKey("name");
values.setValue(args[0]);
collection.addKeyValuePairOfstringanyType(values);
}
if (! (args[13].equals("null"))){
OrganizationServiceStub.KeyValuePairOfstringanyType incoterm = new OrganizationServiceStub.KeyValuePairOfstringanyType();
incoterm.setKey("address1_freighttermscode");
incoterm.setValue(args[13]);
collection.addKeyValuePairOfstringanyType(incoterm);
}
newEntryInfo.setAttributes(collection);
newEntryInfo.setLogicalName("account");
entry.setEntity(newEntryInfo);
OrganizationServiceStub.CreateResponse createResponse = serviceStub.create(entry);
OrganizationServiceStub.Guid createResultGuid = createResponse.getCreateResult();
System.out.println("New Account GUID: " + createResultGuid.getGuid());
return createResultGuid;
} catch (IOrganizationService_Create_OrganizationServiceFaultFault_FaultMessage e) {
logger.error(e.getMessage());
} catch (RemoteException e) {
logger.error(e.getMessage());
}
return null;
}
When it executes, I get this error
[ERROR] Incorrect attribute value type System.String
Does anyone have examples on how to handle picklists or lookups?
To connect the contact with the account I'm filling the fields parentcustomerid and parentcustomeridtype with the GUID from the account and with "account", but the contact does not get associated with the account.
To set a picklist value you must use an OptionSetValue, and for a lookup you must use an EntityReference. See the SDK's C# documentation; it should work the same way with the Axis-generated Java code.
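incoterm.setKey("address1_freighttermscode");
// Assuming the arg is an integer value that matches a picklist value for the attribute,
// and that the Axis-generated stub exposes OptionSetValue with a setValue(int) setter.
OrganizationServiceStub.OptionSetValue freight = new OrganizationServiceStub.OptionSetValue();
freight.setValue(Integer.parseInt(args[13]));
incoterm.setValue(freight);
collection.addKeyValuePairOfstringanyType(incoterm);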
I haven't worked with Java for over a decade (and never against a Microsoft product like Dynamics), so this might be way off from what you're after. :)
You could use the REST web service and call CRM directly to create your instances. As far as I know, that is platform independent and should work as long as you can connect to the exposed OrganizationData service.

Google App engine memcached design

I am new to GAE memcache and need some help with it.
My datastore exceeded the Datastore Read Operations limit because I didn't use memcache. It has few writes but many reads, and every write must be visible to subsequent reads. The site is live, so I need a quick resolution and some design help. Whenever there is a write to the datastore, the new entry should be put into memcache. I would also like to know how the datastore can be replicated to memcache. I am working on this in parallel, but since the site is live I am asking here without any code in hand.
Thanks
UPDATE:
Java code looks like this:
MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
if(memcache.contains("LocationInfo"))
{
    JSONArray js = new JSONArray((String)memcache.get("LocationInfo"));
    result = new ArrayList<LocationInfo>();
    for(int i = 0; i < js.length(); i++)
    {
        JSONObject jso = (JSONObject)js.get(i);
        LocationInfo loc = new LocationInfo(jso);
        result.add(loc);
    }
}
else
{
    q1= pm.newQuery(LocationInfo.class);
    q1.setFilter(filter);
    result = (List<LocationInfo>)q1.execute();
    JSONArray js = new JSONArray();
    for(LocationInfo loc : result)
    {
        js.put(loc.toJSON());
    }
    memcache.put("LocationInfo", js.toString());
}
from google.appengine.ext import db
from google.appengine.api import memcache

def top_arts(update=False):
    key = 'top'
    # Get arts from memcache
    arts = memcache.get(key)
    # Query the datastore if the key is not in memcache
    # or an update has been requested
    if update or not arts:
        # Query the Google datastore using GQL
        arts = db.GqlQuery('SELECT * from Art ORDER BY created DESC LIMIT 10')
        memcache.set(key, arts)
    return arts

You can use the same function both for reading from memcache and for writing data into memcache.
For example, when reading from memcache:
arts = top_arts()
When writing to the database:
# write your entry to the database
<some database code>
# update memcache with this new entry
top_arts(update=True)
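Since the question's code is in Java, the same read-then-refresh idea can be sketched there as well. The following is a minimal sketch in which plain in-memory maps stand in for memcache and the datastore, so it runs without the App Engine SDK; the WriteThroughCache class and its methods are hypothetical names, not part of any Google API. It reads from the cache first, falls back to the store on a miss, and refreshes the cached copy on every write.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: plain maps stand in for memcache and the datastore. */
class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for memcache
    private final Map<K, V> store;                              // stand-in for the datastore
    private final AtomicInteger storeReads = new AtomicInteger();

    WriteThroughCache(Map<K, V> store) {
        this.store = store;
    }

    /** Read path: try the cache first and fall back to the store on a miss. */
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            storeReads.incrementAndGet();
            value = store.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    /** Write path: persist first, then refresh the cached copy. Values are assumed non-null. */
    void put(K key, V value) {
        store.put(key, value);
        cache.put(key, value);
    }

    /** Number of times the store had to be consulted. */
    int storeReads() {
        return storeReads.get();
    }

    public static void main(String[] args) {
        Map<String, String> datastore = new HashMap<>();
        datastore.put("LocationInfo", "[]");
        WriteThroughCache<String, String> locations = new WriteThroughCache<>(datastore);
        locations.get("LocationInfo");                       // miss: reads the store once
        locations.get("LocationInfo");                       // hit: served from the cache
        locations.put("LocationInfo", "[{\"id\":1}]");       // write refreshes the cache
        System.out.println(locations.get("LocationInfo"));
        System.out.println("store reads: " + locations.storeReads());
    }
}
```

A real memcache entry can be evicted at any time, so the fallback read in get() remains necessary even when every write refreshes the cache.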
