Let's see what we have. The first file [Interface Class]:
list arrayList
list linkedList
The second file [Class countOfInstance]:
arrayList 120
linkedList 4
I would like to join these two files on the key [Class] and get the count per Interface:
list 124
Here is the code:
public class Main
{
public static void main( String[] args )
{
String docPath = args[ 0 ];
String wcPath = args[ 1 ];
String stopPath = args[ 2 ];
Properties properties = new Properties();
AppProps.setApplicationJarClass( properties, Main.class );
AppProps.setApplicationName( properties, "Part 1" );
AppProps.addApplicationTag( properties, "lets:do:it" );
AppProps.addApplicationTag( properties, "technology:Cascading" );
FlowConnector flowConnector = new Hadoop2MR1FlowConnector( properties );
// create source and sink taps
Tap docTap = new Hfs( new TextDelimited( true, "\t" ), docPath );
Tap wcTap = new Hfs( new TextDelimited( true, "\t" ), wcPath );
Fields stop = new Fields( "class" );
Tap classTap = new Hfs( new TextDelimited( true, "\t" ), stopPath );
// specify a regex operation to split the "document" text lines into a token stream
Fields token = new Fields( "token" );
Fields text = new Fields( "interface" );
RegexSplitGenerator splitter = new RegexSplitGenerator( token, "[ \\[\\]\\(\\),.]" );
Fields fieldSelector = new Fields( "interface", "class" );
Pipe docPipe = new Each( "token", text, splitter, fieldSelector );
// define "ScrubFunction" to clean up the token stream
Fields scrubArguments = new Fields( "interface", "class" );
docPipe = new Each( docPipe, scrubArguments, new ScrubFunction( scrubArguments ), Fields.RESULTS );
Fields text1 = new Fields( "amount" );
// RegexSplitGenerator splitter = new RegexSplitGenerator( token, "[ \\[\\]\\(\\),.]" );
Fields fieldSelector1 = new Fields( "class", "amount" );
Pipe stopPipe = new Each( "token1", text1, splitter, fieldSelector1 );
Pipe tokenPipe = new CoGroup( docPipe, token, stopPipe, text, new InnerJoin() );
tokenPipe = new Each( tokenPipe, text, new RegexFilter( "^$" ) );
// determine the word counts
Pipe wcPipe = new Pipe( "wc", tokenPipe );
wcPipe = new Retain( wcPipe, token );
wcPipe = new GroupBy( wcPipe, token );
wcPipe = new Every( wcPipe, Fields.ALL, new Count(), Fields.ALL );
// connect the taps, pipes, etc., into a flow
FlowDef flowDef = FlowDef.flowDef().setName( "wc" ).addSource( docPipe, docTap ).addSource( stopPipe, classTap ).addTailSink( wcPipe, wcTap );
// write a DOT file and run the flow
Flow wcFlow = flowConnector.connect( flowDef );
wcFlow.writeDOT( "dot/wc.dot" );
wcFlow.complete();
}
}
[I decided to work through this issue step by step and will leave the final result here for others. The first step, joining two files on one key via Cascading, is not completed yet.]
I would convert the two files to two Map objects, iterate through the keys and sum up the numbers. Then you can write them back to a file.
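Map<String,String> nameToType = new HashMap<String,String>();
Map<String,Integer> nameToCount = new HashMap<String,Integer>();
// fill the maps from the files here:
// nameToType maps class -> interface, nameToCount maps class -> count
Map<String,Integer> result = new HashMap<String,Integer>();
for (String name : nameToType.keySet())
{
String type = nameToType.get(name);
// look the count up by class name, not by interface
Integer count = nameToCount.get(name);
if (count == null)
continue; // class has no count entry, skip it (inner join)
if (!result.containsKey(type))
result.put(type, 0);
result.put(type, result.get(type) + count);
}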
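Filling the maps and producing the per-interface totals could look roughly like the sketch below. It assumes whitespace-separated lines as shown in the question and takes the lines as lists, so reading and writing the actual files is left to the caller. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InterfaceCountJoin {

    // Joins "interface class" lines with "class count" lines on the class name
    // and sums the counts per interface.
    public static Map<String, Integer> countPerInterface(List<String> interfaceLines,
                                                         List<String> countLines) {
        // class name -> instance count (second file)
        Map<String, Integer> classToCount = new LinkedHashMap<>();
        for (String line : countLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {
                classToCount.put(parts[0], Integer.parseInt(parts[1]));
            }
        }

        // interface name -> summed count; classes without a count are skipped (inner join)
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : interfaceLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue; // ignore blank or malformed lines
            }
            Integer count = classToCount.get(parts[1]);
            if (count != null) {
                result.merge(parts[0], count, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("list arrayList", "list linkedList");
        List<String> second = Arrays.asList("arrayList 120", "linkedList 4");
        // prints "list 124"
        countPerInterface(first, second).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

For real files, the two lists could come from java.nio.file.Files.readAllLines, and the result can be written back with Files.write.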
Using Java Reflection:
How do I generically access arrays of other objects to retrieve their values?
Given this Java structure:
class Master
{
static class innerThing
{
static StringBuilder NumOfThings = new StringBuilder( 2);
static class Thing_def
{
static StringBuilder field1 = new StringBuilder( 3);
static StringBuilder field2 = new StringBuilder( 3);
static StringBuilder field3 = new StringBuilder(13);
}
static Thing_def[] Things = new Thing_def [2];
static { for (int i=0; i<Things.length; i++) Things[i] = new Thing_def(); }
}
}
Using Reflection in this bit of code:
Field[] FieldList = DataClass.getDeclaredFields();
if (0 < FieldList.length )
{
SortFieldList( FieldList );
System.out.println();
for (Field eachField : FieldList)
{
String fldType = new String( eachField.getType().toString() );
if ( fldType.startsWith("class [L") )
System.err.printf("\n### fldType= '%s'\n", fldType); //$$$$$$$$$$$$$$$
if ( fldType.startsWith("class java.lang.StringBuilder") )
{
g_iFieldCnt++;
String str = DataClass.getName().replaceAll("\\$",".");
System.out.printf("%s.%s\n", str, eachField.getName() );
}//endif
}//endfor
}//endif
I get the following output:
(Notice that it shows one copy of the fields in Thing_def.)
Master.innerThing.NumOfThings
### fldType= 'class [LMaster$innerThing$Thing_def;'
Master.innerThing.Thing_def.field1
Master.innerThing.Thing_def.field2
Master.innerThing.Thing_def.field3
In another part of the system I access the fields to generate a CSV file:
Field[] FieldList = DataClass.getDeclaredFields();
if (0 < FieldList.length )
{
for (Field eachField : FieldList)
{
String fldType = new String( eachField.getType().toString() );
if ( fldType.startsWith("class java.lang.StringBuilder") )
{
Field fld = DataClass.getDeclaredField( eachField.getName() );
StringBuilder sb = (StringBuilder)fld.get(null);
CSV_file.printf("%s,", sb ); // emit column to CSV
//fld.set( DataClass, new StringBuilder() );
}//endif
}//endfor
}//endif
So in this case I actually need to access the array elements directly.
That is, I need to get at each Master.innerThing.Things[n].field.
So, the big questions are:
How do I generically access arrays like this?
How do I know that Thing_def does not hold data itself,
but is merely a structural definition for Things[]?
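One possible direction, sketched under assumptions: java.lang.reflect.Array can read the elements of any array obtained through a Field, and Modifier.isStatic tells whether a field belongs to the class (shared by every element) or to each instance. Note that in the structure above the fields of Thing_def are static, so all Things[n] would share one copy. The sketch below uses hypothetical classes (Thing, Holder) with instance fields to show per-element access.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ArrayFieldWalker {

    // Hypothetical element type: instance fields, so each array element has its own data.
    public static class Thing {
        StringBuilder field1;
        StringBuilder field2;

        Thing(String a, String b) {
            field1 = new StringBuilder(a);
            field2 = new StringBuilder(b);
        }
    }

    // Hypothetical holder with a static array, similar in shape to innerThing.Things.
    public static class Holder {
        static Thing[] things = { new Thing("a0", "b0"), new Thing("a1", "b1") };
    }

    // Returns "arrayName[i].fieldName=value" for every instance field of every element
    // of every static array field declared in dataClass.
    public static List<String> walk(Class<?> dataClass) {
        List<String> out = new ArrayList<>();
        try {
            for (Field arrayField : dataClass.getDeclaredFields()) {
                if (!arrayField.getType().isArray()
                        || !Modifier.isStatic(arrayField.getModifiers())) {
                    continue; // only static array fields are handled in this sketch
                }
                arrayField.setAccessible(true);
                Object array = arrayField.get(null);
                if (array == null) {
                    continue;
                }
                Class<?> elementType = arrayField.getType().getComponentType();
                for (int i = 0; i < Array.getLength(array); i++) {
                    Object element = Array.get(array, i);
                    if (element == null) {
                        continue;
                    }
                    for (Field f : elementType.getDeclaredFields()) {
                        // static fields are shared by all elements; skip them here
                        if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                            continue;
                        }
                        f.setAccessible(true);
                        out.add(arrayField.getName() + "[" + i + "]." + f.getName()
                                + "=" + f.get(element));
                    }
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        walk(Holder.class).forEach(System.out::println);
    }
}
```

If the element type declares only static fields, the inner loop finds nothing per element, which is one way to detect that the class is a purely structural definition.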
I have tried numerous ways and followed some of the examples that are scattered around the web on how to write a jagged array (an array of arrays that may be of differing lengths) in HDF5.
Most of the examples are in C and rather low-level. Anyhow I can't seem to get it working and I just looked at the C-source code and it pretty much says that any variable-length datatypes that are not strings are not supported (if I understood correctly).
My miserable dysfunctional code (as is):
public void WIP_createVLenFloatDataSet( List<? extends Number> floats ) throws Exception
{
String group = "/test";
long groupId = createGroupIfNotExist( group );
MDataQualifier qualifier = new MDataQualifierImpl( group, "float", "0.0.0" );
long datasetId = openDataSet( qualifier );
long heapType = H5.H5Tcopy( MDataType.FLOAT_ARRAY.getHDFType() );
heapType = H5.H5Tvlen_create( heapType );
// heapType = H5.H5Tarray_create( heapType, 1, new long[]{1} );
if( !exists( datasetId ) )
{
long[] maxDims = new long[]{ HDF5Constants.H5S_UNLIMITED };
long dataspaceId = H5.H5Screate_simple( 1, new long[]{ 1 }, null );
// Create the dataset.
long datasetId1 = -1;
try
{
if( exists( m_fileId ) && exists( dataspaceId ) && exists( heapType ) )
{
long creationProperties = H5.H5Pcreate( HDF5Constants.H5P_DATASET_CREATE );
H5.H5Pset_chunk( creationProperties, /*ndims*/1, new long[]{ 1 } );
datasetId1 = H5.H5Dcreate( groupId, qualifier.getVersionedName(), heapType, dataspaceId, H5P_DEFAULT, creationProperties, H5P_DEFAULT );
// H5.H5Pclose( creationProperties );
}
}
catch( Exception e )
{
LOG.error( "Problems creating the dataset: " + e.getMessage(), e );
}
datasetId = datasetId1;
if( exists( datasetId ) )
{
// flushIfNecessary();
LOG.trace( "Wrote empty dataset {}", qualifier.getVersionedName() );
}
}
List<? extends Number> data = ( List<? extends Number> )floats;
// H5.H5Dwrite( datasetId, heapType, dataspaceId, memSpaceId, HDF5Constants.H5P_DEFAULT, Floats.toArray( data) );
ByteBuffer bb = ByteBuffer.allocate( data.size() * 4 );
floats.forEach( f -> bb.putFloat( f.floatValue() ) );
// H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, Floats.toArray( data ) );
H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, bb.array() );
}
Has anyone done this before and can at least confirm that it's not possible?
The most I can get out of HDF5 is the message "buf does not support variable length type".
Apparently the "glue code" of the JNI wrapper doesn't support this. If you want to use this feature you either have to implement your own JNI or wait for a newer version. The official JNI code is open source and can be found here.
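Until the wrapper supports variable-length types, a workaround that is sometimes used (not part of the answer above, and named here only as an alternative) is to flatten the jagged array into one flat values array plus an offsets array, and store those as two ordinary fixed-type 1-D datasets. The sketch below shows only the plain-Java flattening and makes no HDF5 calls. The class name and layout are assumptions.

```java
import java.util.Arrays;
import java.util.List;

public class JaggedFlattener {

    // Concatenated values of all rows.
    public final float[] values;
    // Row i occupies values[offsets[i]] .. values[offsets[i + 1] - 1]; length is rows + 1.
    public final long[] offsets;

    public JaggedFlattener(List<float[]> rows) {
        offsets = new long[rows.size() + 1];
        int total = 0;
        for (int i = 0; i < rows.size(); i++) {
            offsets[i] = total;
            total += rows.get(i).length;
        }
        offsets[rows.size()] = total;

        values = new float[total];
        for (int i = 0; i < rows.size(); i++) {
            float[] row = rows.get(i);
            System.arraycopy(row, 0, values, (int) offsets[i], row.length);
        }
    }

    // Reconstructs row i from the flat representation.
    public float[] row(int i) {
        return Arrays.copyOfRange(values, (int) offsets[i], (int) offsets[i + 1]);
    }

    public static void main(String[] args) {
        JaggedFlattener flat = new JaggedFlattener(Arrays.asList(
                new float[] { 1f, 2f, 3f }, new float[] {}, new float[] { 4f }));
        System.out.println(Arrays.toString(flat.values));
        System.out.println(Arrays.toString(flat.offsets));
    }
}
```

The cost of this layout is that appending a row means extending both datasets, and readers need the offsets to recover row boundaries.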
I have values coming from MongoDB stored in a DBObject, and I need to store those values in a set one by one. Being new to MongoDB, I don't have a clear idea of how to proceed.
String date = sdf.format(cal2.getTime());
List<String> dateList = new ArrayList<String>();
for (int i = 0; i < 6; i++) {
Date dateParsed = sdf.parse(date);
dateParsed.setDate(dateParsed.getDate() - i);
dateList.add(sdf.format(dateParsed));
}
Set<String> values2= new HashSet<String>();
for (String str : dateList) {
BasicDBObject find1 = new BasicDBObject("_ky", str);
DBObject values1= someDB.findOne(find1);
Iterator iter = values1.iterator(); /* gives an error: method not found (because values1 is a DBObject) */
while (iter.hasNext()) {
values2.add(//???//);
}
}
Any help on how to iterate over the DBObject values1 and add its values to the set values2 would be great.
You can call values1.keySet(), iterate over that and get() any values, or use values1.toMap() and iterate over that Map like you would any other.
The primary abstraction in the Mongo Java Driver is the DBObject which acts like a wrapper around Java's Map<String,Object>.
import java.util.Arrays;
import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
public class DBObjectKeySetDemo {
public static void main( String[] args ) {
DBObject dbo = new BasicDBObject( "firstName", "Robert" ).append( "lastName", "Kuhar" );
dbo.put( "age", 49 );
dbo.put( "hobbies", Arrays.asList( "Fly Fishing", "Board Games", "Roller Derby" ) );
DBObject browser = new BasicDBObject( "implementation", "Chrome" );
browser.put( "vendor", "Google" );
BasicDBList bookmarks = new BasicDBList();
bookmarks.add( new BasicDBObject( "name", "StackOverflow" ).append( "URL", "http://stackoverflow.com" ) );
bookmarks.add( new BasicDBObject( "name", "MMS" ).append( "URL", "https://mms.mongodb.com" ) );
browser.put( "bookmarks", bookmarks );
dbo.put( "browser", browser );
for ( String key : dbo.keySet() ) {
System.out.println( "key: " + key + " value: " + dbo.get( key ) );
}
System.out.println( "dbo: " + dbo );
}
}
The "gotcha" is that you can only work directly with "top level" elements. For example, in the above example, through the Java API, you have no way to directly reference "browser.vendor". Through the Java API you would have to first get the "browser" sub-document, and then get the "vendor" field.
Clear? As mud? It helped me to just think of the abstraction as a Map<String,Object> where Object, in the case of a sub-document, may itself be a Map<String,Object>.
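To make the Map<String,Object> view concrete, here is a small sketch that uses plain java.util maps instead of the driver classes, so it runs without MongoDB. With the driver, the map would presumably come from values1.toMap(). The helper names getByPath and topLevelValues are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class NestedMapLookup {

    // Walks a dotted path such as "browser.vendor" one level at a time.
    // Returns null if any step is missing or is not a sub-document.
    public static Object getByPath(Map<String, Object> doc, String dottedPath) {
        Object current = doc;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Collects the string form of every top-level value into a set.
    public static Set<String> topLevelValues(Map<String, Object> doc) {
        Set<String> values = new LinkedHashSet<>();
        for (Object value : doc.values()) {
            values.add(String.valueOf(value));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> browser = new LinkedHashMap<>();
        browser.put("implementation", "Chrome");
        browser.put("vendor", "Google");

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("firstName", "Robert");
        doc.put("browser", browser);

        System.out.println(getByPath(doc, "browser.vendor")); // prints "Google"
        System.out.println(topLevelValues(doc));
    }
}
```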
I'm trying to add a multi-value option to my ini file from Groovy using ini4j with the following code (I tried a few variants):
// first variant
import org.ini4j.Wini
List valuesList = [ 'val1', 'val2', 'val3' ]
( new Wini( new File( "test.ini" ) ) ).with{
valuesList.each{
put( 'sectionName', 'optionName', it )
}
store()
}
// second variant
import org.ini4j.Wini
List valuesList = [ 'val1', 'val2', 'val3' ]
( new Wini( new File( "test.ini" ) ) ).with{
def sectionObject = get( 'sectionName' )
sectionObject.put( 'optionName', 'val1' )
sectionObject.put( 'optionName', 'val2' )
sectionObject.put( 'optionName', 'val3' )
store()
}
I got ini file like this one:
[sectionName]
optionName = val3
But I want to get:
[sectionName]
optionName = val1
optionName = val2
optionName = val3
Could you please advise me how to resolve this issue? Thanks in advance!
Update 1
I'm still waiting for a more elegant solution, but in the meantime I wrote the direct ini file editing below. Any feedback on it is welcome:
List newLines = []
File currentFile = new File( "test.ini" )
List currentLines = currentFile.readLines()
int indexSectionStart = currentLines.indexOf( '[sectionName]' ) // the header line includes the brackets
(0..indexSectionStart).each{
newLines.add( currentLines[ it ] )
}
List valuesList = 'val1,val2,val3'.split( ',' )
valuesList.each{
newLines.add( "optionName = $it" )
}
( indexSectionStart + 1 .. currentLines.size() - 1 ).each{
newLines.add( currentLines[ it ] )
}
File newFile = new File( "new_test.ini" )
if ( newFile.exists() ) newFile.delete()
newLines.each {
newFile.append( it+'\n' )
}
Then I simply delete the old file and rename the new one. I implemented it this way because I didn't find any insertLine()-like method in the standard File class.
Right, how's this:
import org.ini4j.*
List valuesList = [ 'val1', 'val2', 'val3' ]
new File( "/tmp/test.ini" ).with { file ->
new Wini().with { ini ->
// Configure to allow multiple options
ini.config = new Config().with { it.multiOption = true ; it }
// Load the ini file
ini.load( file )
// Get or create the section
( ini.get( 'sectionName' ) ?: ini.add( 'sectionName' ) ).with { section ->
valuesList.each {
// Then ADD the options
section.add( 'optionName', it )
}
}
// And write it back out
store( file )
}
}
Currently I'm using the BouncyCastle library to generate a certificate. Something like this:
X509V3CertificateGenerator certGenerator = new X509V3CertificateGenerator();
certGenerator.setIssuerDN( rootCertificate.getSubjectX500Principal() );
certGenerator.setSignatureAlgorithm( "SHA1withRSA" );
certGenerator.setSerialNumber( serial );
certGenerator.setNotBefore( notBefore );
certGenerator.setNotAfter( notAfter );
certGenerator.setPublicKey( rootCertificate.getPublicKey() );
Hashtable<DERObjectIdentifier, String> attrs = new Hashtable<DERObjectIdentifier, String>();
Vector<DERObjectIdentifier> order = new Vector<DERObjectIdentifier>();
attrs.put( X509Principal.C, "RU" );
// other attrs.put() calls here
order.addElement( X509Principal.C );
// other order.addElement() calls here
certGenerator.setSubjectDN( new X509Principal( order, attrs ) );
certGenerator.addExtension( X509Extensions.AuthorityKeyIdentifier, false, new AuthorityKeyIdentifierStructure( rootCertificate ) );
certGenerator.addExtension( X509Extensions.SubjectKeyIdentifier, false, new SubjectKeyIdentifierStructure( newKeyPair.getPublic() ) );
return certGenerator.generate( rootPrivateKey, "BC" );
Can I add the SubjectAltNames field to the generated certificate?
To accomplish the task, insert the following just before the certGenerator.generate() call:
ASN1EncodableVector alternativeNames = new ASN1EncodableVector();
for( String domainName : domainNames )
{
alternativeNames.add( new GeneralName( GeneralName.dNSName, domainName ) );
}
certGenerator.addExtension( X509Extensions.SubjectAlternativeName, false, new GeneralNames( new DERSequence( alternativeNames ) ) );
(Answer provided by Double-V).
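To check that the extension ended up in the generated certificate, the standard JDK method X509Certificate.getSubjectAlternativeNames() can be used. It returns a collection of two-element lists (type tag, value), or null when the extension is absent. The sketch below only filters such a collection for DNS names. The class name SanInspector is made up, and the sample data in main is hand-built rather than taken from a real certificate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class SanInspector {

    // GeneralName type tag for dNSName as defined in RFC 5280.
    private static final int DNS_NAME = 2;

    // Extracts the DNS entries from the structure returned by
    // X509Certificate.getSubjectAlternativeNames().
    public static List<String> dnsNames(Collection<List<?>> altNames) {
        List<String> result = new ArrayList<>();
        if (altNames == null) {
            return result; // no SubjectAlternativeName extension present
        }
        for (List<?> entry : altNames) {
            if (entry.size() >= 2 && Integer.valueOf(DNS_NAME).equals(entry.get(0))) {
                result.add(String.valueOf(entry.get(1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built sample in the same shape the JDK returns: [type tag, value].
        List<List<?>> sample = new ArrayList<>();
        sample.add(Arrays.asList(2, "example.com"));
        sample.add(Arrays.asList(7, "10.0.0.1")); // tag 7 is iPAddress, not a DNS name
        System.out.println(dnsNames(sample)); // prints "[example.com]"
    }
}
```

With a real certificate the call would be dnsNames( cert.getSubjectAlternativeNames() ), which can throw CertificateParsingException.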