I am writing a Java application that retrieves and presents data from an HBase database.
When writing the Get method for retrieving a row, I would like to get all the data for that row, but exclude the value for a particular column family (the "big" column family). Note: I need to retrieve the column names (qualifiers?) in that family because they contain valuable information.
Is it possible to write a Filter for that?
I have two solutions. The first one does not work and the second one is quite slow.
First solution (using a composite filter):
HTable table = getTable();
Get get = new Get(row);
FilterList filter = new FilterList(FilterList.Operator.MUST_PASS_ONE);
FilterList subFilterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
subFilterList.addFilter(new KeyOnlyFilter());
subFilterList.addFilter(new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
filter.addFilter(subFilterList);
filter.addFilter(new FamilyFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
get.setFilter(filter);
retrieveAndUseResult(table, get);
This solution works neither conceptually nor in practice, but perhaps I am on the right track using a composite FilterList?
Second solution (using two gets):
HTable table = getTable();
Get get = new Get(row);
// exclude the entire "big" column family
get.setFilter(new FamilyFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
retrieveAndUseResult(table, get);
Get get2 = new Get(row);
// include the "big" column family, but only retrieve the key
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filterList.addFilter(new KeyOnlyFilter());
filterList.addFilter(new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
get2.setFilter(filterList);
retrieveAndUseResult(table, get2);
This works, but I would favor having to do only one get.
I ended up using a variant of the second solution - using two gets. But I used a batch get list to speed it up.
The code:
HTable table = getTable();
Get get = new Get(row);
// exclude the entire "big" column family
get.setFilter(new FamilyFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
Get get2 = new Get(row);
// include the "big" column family, but only retrieve the key
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filterList.addFilter(new KeyOnlyFilter());
filterList.addFilter(new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("big"))));
get2.setFilter(filterList);
List<Get> getList = new ArrayList<Get>();
getList.add(get);
getList.add(get2);
retrieveAndUseResults(table, getList);
I have recently tried to write a small Java program that inserts a row into an existing table in an .odt document. The table has 4 rows and 3 columns, but I would like to add a check that expands the table if the content to be inserted needs more than 4 rows. However, every time I try to get the table's rows, I get a null pointer. I am not that familiar with the UNO API, but as far as I can tell from the documentation, the interface XColumnRowRange should be used in this situation. My code is as follows:
XTextTablesSupplier xTablesSupplier = (XTextTablesSupplier) UnoRuntime.queryInterface(XTextTablesSupplier.class, xTextDocument);
XNameAccess xNamedTables = xTablesSupplier.getTextTables();
try {
Object table = xNamedTables.getByName(tblName);
XTextTable xTable = (XTextTable) UnoRuntime.queryInterface(XTextTable.class, table);
XCellRange xCellRange = (XCellRange) UnoRuntime.queryInterface(XCellRange.class, table);
if(flag){
XColumnRowRange xCollumnAndRowRange =(XColumnRowRange)
UnoRuntime.queryInterface(XColumnRowRange.class, xCellRange);
XTableRows rows = xCollumnAndRowRange.getRows();
System.out.println("Testing if this works");
rows.insertByIndex(4, size-4);
}
I am not sure if I am missing something here or if I should be using a different function.
As Lyrl suggested, this works:
XTableRows rows = xTable.getRows();
Apparently XColumnRowRange is only used for spreadsheets.
Note: With Basic or Python you would not have this problem, because those languages do not need queryInterface. The code would simply be:
table = tables.getByName(tblName)
rows = table.getRows()
I have a Java class that automates some behaviour on the web. My only problem is that, instead of the static data I have now, I need to use data from a CSV file.
For example, this is one of the actions in my automation class:
WebElement supplierAddressField = driver.findElement(By.id("FieldaddressOfSupplierLine"));
supplierAddressField.sendKeys("hollywood blvd 34");
So now, instead of the static address value, I want to iterate over each line of the CSV and call .sendKeys(csvLineMap.get("supplier address"));
Because I don't need all the header info for each line, I think it would be best to create a list of maps, where each map's keys are the CSV headers and the values are the values for those headers in one specific line.
This is the structure of the csv:
Please help me figure this out. Thanks!
Apache Commons CSV
For what you are asking, I would recommend you look at Apache Commons CSV. One of the examples from their User Guide matches very closely what you are trying to do:
Reader in = new FileReader("path/to/file.csv");
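// the first record has to be read as the header, otherwise record.get("...") by name fails
Iterable<CSVRecord> records = CSVFormat.EXCEL.withFirstRecordAsHeader().parse(in);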
for (CSVRecord record : records) {
String lastName = record.get("Last Name");
String firstName = record.get("First Name");
}
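If you want the list of maps described in the question without adding a library, the idea can be sketched with plain Java as below. This is only a rough sketch under stated assumptions: the first line holds the headers, fields are separated by plain commas, and there are no quoted fields or embedded commas (use a real CSV parser if there are). The class name CsvRows, the method fromLines and the "supplier name" column are made up for illustration; only "supplier address" comes from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Commons CSV or any other library.
class CsvRows {

    // Turns CSV lines into one map per data line, keyed by the header names.
    // Assumes: first line = headers, plain comma separator, no quoted fields.
    static List<Map<String, String>> fromLines(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows; // no header line, so nothing to map
        }
        String[] headers = lines.get(0).split(",", -1);
        for (String line : lines.subList(1, lines.size())) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < headers.length; i++) {
                // a missing trailing value becomes an empty string
                row.put(headers[i].trim(), i < values.length ? values[i].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // "supplier name" is an invented second column, only here for the demo
        List<String> lines = Arrays.asList(
                "supplier address,supplier name",
                "hollywood blvd 34,Acme");
        for (Map<String, String> csvLineMap : fromLines(lines)) {
            System.out.println(csvLineMap.get("supplier address"));
        }
    }
}
```

For a real file, the lines could come from Files.readAllLines(Paths.get("path/to/file.csv")), and inside the loop each csvLineMap can be passed to calls such as supplierAddressField.sendKeys(csvLineMap.get("supplier address")).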
OK, this might be overly complex for what you want, but I always open CSVs as Excel files because then you can run down the columns. The code for picking up any column would look like this:
Workbook w = Workbook.getWorkbook(inputWorkbook);
Sheet sheet = w.getSheet(0);
int nom = sheet.getRows(); // number of rows in the sheet
// change the first number to the number of columns you want,
// or pick up the number the same way as for the rows
String[][] SheetArray = new String[2][nom];
Cell cell;
String cellcont;
// cycles through the rows and loads the chosen column into the 2d array
for (int j = 0; j < nom; j++) {
cell = sheet.getCell(0, j); // <- your column number here
cellcont = cell.getContents();
SheetArray[0][j] = cellcont;
// repeat the above block for each column you want
}
You now have a 2d array with all the info in it, which you can handle however you want.
Wrap the entire thing in a try .. catch.
With uniVocity-parsers you can parse only the fields you are interested in, in any order:
CsvParserSettings parserSettings = new CsvParserSettings();
// Let's extract headers
parserSettings.setHeaderExtractionEnabled(true);
parserSettings.selectFields("Field 5", "Field 1");
CsvParser parser = new CsvParser(parserSettings);
//Rows will come organized according to your field selection
List<String[]> allRows = parser.parseAll(new FileReader("path/to/file.csv"));
If you prefer, you can easily get a map with the values of all columns:
CsvParserSettings parserSettings = new CsvParserSettings();
// Let's extract headers
parserSettings.setHeaderExtractionEnabled(true);
// To get the values of all columns, use a column processor
ColumnProcessor rowProcessor = new ColumnProcessor();
parserSettings.setRowProcessor(rowProcessor);
CsvParser parser = new CsvParser(parserSettings);
//This will kick in our column processor
parser.parse(new FileReader("path/to/file.csv"));
//Finally, we can get the column values:
Map<String, List<String>> columnValues = rowProcessor.getColumnValuesAsMapOfNames();
Have a look. It is faster than any other parser and you can do much more, such as converting the values and generating java beans.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
I am trying to remove duplicates in a collection using java driver in mongodb.
I am using the code
table = db.getCollection("dummy_data_OLD");
BasicDBObject query = new BasicDBObject("url", 1)
.append("unique", true).append("dropDups", true);
table.createIndex(query);
It creates a unique index, but the duplicates are still present in the db.
Is there a problem in my code?
This creates an index on the fields url, unique and dropDups. When you want to create an index using options, you need to provide these as a second DBObject.
DBObject fields = new BasicDBObject("url", 1);
DBObject options = new BasicDBObject("unique", true).append("dropDups", true);
db.getCollection("dummy_data_OLD").createIndex(fields, options);
I'm currently working on a Java application that reads an Access file and builds a JTable model from the data that's collected. I've previously done the same with an Excel file, but with Jackcess it turned out to be slightly different and I've run into some question marks.
My work so far:
public class AccessModel{
public DefaultTableModel getAccessModel() throws IOException {
Database db = DatabaseBuilder.open(new File("MyFile.accdb"));
Vector<String> columnNames = new Vector<String>();
Vector<String> vector = new Vector<String>();
Vector<Vector<String>> data = new Vector<Vector<String>>();
StringBuilder output = new StringBuilder();
Table table = db.getTable("Table1");
for (Column column : table.getColumns()) { // get the table column names
output.append(column.getName());
output.append("\n");
columnNames.add(column.getName());
}
for (Column column : table.getColumns()) { // get the column rows and values
vector.add(column.getRowValue(table.getNextRow()).toString());
}
data.add(vector);
// return the model to Gui
DefaultTableModel accessModel = new DefaultTableModel(data, columnNames);
return accessModel;
}
}
As you can see, this method only iterates through the first row and then exits the loop. I'm either blind to an obvious solution after 12 hours of straight work, or I'm doing something terribly wrong.
I've stumbled across some half-good solutions where an Iterator is used, but I cannot get the hang of it. Any suggestions on this? Or should I stay on lane with my current line of thought?
A JTable is a row-based object: the values shown in the view are stored in an XxxTableModel (in your case a DefaultTableModel), while the TableColumns (stored in the TableColumnModel) divide the rows into columns.
You need to create two objects:
a Vector<String> columnNames (a single row) holding the column identifiers from Table table = db.getTable("Table1");
a two-dimensional Vector<Vector<Object>> data = new Vector<Vector<Object>>();, filled by looping over the rows of Table table = db.getTable("Table1"); with a Vector<Object> vector = new Vector<Object>(); for each row. Notice that the first line inside the loop must be vector = new Vector<Object>(); because you have to create a new Vector for every row, otherwise you add the same row n times. The last line inside the loop should be data.add(vector).
As far as I know, all of this is described in the Oracle tutorial How to Use Tables.
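The "new Vector for every row" point can be sketched as follows. This is only an illustration under assumptions, not Jackcess code: the rows are passed in as plain Maps from column name to value (as far as I know a Jackcess Row is such a Map and a Table can be iterated row by row, but check that against your version), and the names TableModelSketch and buildModel are made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Vector;
import javax.swing.table.DefaultTableModel;

// Hypothetical helper for illustration; the rows stand in for whatever the database returns.
class TableModelSketch {

    // Builds a model with one Vector per row; each row is a map from column name to value.
    static DefaultTableModel buildModel(List<String> columnNames,
                                        Iterable<? extends Map<String, Object>> rows) {
        Vector<String> header = new Vector<>(columnNames);
        Vector<Vector<Object>> data = new Vector<>();
        for (Map<String, Object> row : rows) {
            // first line inside the loop: a fresh Vector for this row
            Vector<Object> vector = new Vector<>();
            for (String name : columnNames) {
                vector.add(row.get(name));
            }
            // last line inside the loop: add the finished row
            data.add(vector);
        }
        return new DefaultTableModel(data, header);
    }

    public static void main(String[] args) {
        // invented sample rows, standing in for the rows of "Table1"
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        first.put("name", "alpha");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);
        second.put("name", "beta");

        DefaultTableModel model = buildModel(Arrays.asList("id", "name"),
                Arrays.asList(first, second));
        System.out.println(model.getRowCount() + " rows, " + model.getColumnCount() + " columns");
    }
}
```

The outer loop runs over rows and the inner loop over columns, which is the opposite nesting of the code in the question, where the loop runs over the columns and only ever asks for one row.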
This is how I create the spreadsheet:
DocsService client= new DocsService ("idea");
client.useSsl ();
client.setOAuthCredentials (oauthParameters, new OAuthHmacSha1Signer ());
DocumentListEntry newEntry= new com.google.gdata.data.docs.SpreadsheetEntry ();
newEntry.setTitle (new PlainTextConstruct ("GIdeaDB"));
DocumentListEntry insertedEntry= client.insert (new URL (
"https://docs.google.com/feeds/default/private/full/?xoauth_requestor_id="+ userEmail), newEntry);
Now I want to write the first line in it.
But unfortunately all the API calls seem to be based on the assumption that a first line already exists, because you insert name-value pairs (where the name is the headline I want to create).
http://code.google.com/apis/spreadsheets/data/3.0/developers_guide.html#CreatingTableRecords
Any ideas how I can create the first line? The one which defines the field names.
Finally found it. You have to do it cell by cell:
oauthParameters= new GoogleOAuthParameters ();
oauthParameters.setOAuthConsumerKey (CONSUMER_KEY);
oauthParameters.setOAuthConsumerSecret (CONSUMER_SECRET);
oauthParameters.setOAuthType (OAuthType.TWO_LEGGED_OAUTH);
oauthParameters.setScope ("https://spreadsheets.google.com/feeds/");
SpreadsheetService spreadsheetService= new SpreadsheetService ("appname");
spreadsheetService.useSsl ();
spreadsheetService.setOAuthCredentials (oauthParameters,
new OAuthHmacSha1Signer ());
URL feedUrl= new URL (
"https://spreadsheets.google.com"
+ "/feeds/spreadsheets/private/full?title=Spreadsheetname&xoauth_requestor_id="
+ userEmail);
SpreadsheetFeed resultFeed= spreadsheetService.getFeed (feedUrl,
SpreadsheetFeed.class);
List <SpreadsheetEntry> spreadsheets= resultFeed.getEntries ();
SpreadsheetEntry spreadsheetEntry= spreadsheets.get (0);
URL worksheetFeedUrl= spreadsheetEntry.getWorksheetFeedUrl ();
log.severe (worksheetFeedUrl.toString ());
WorksheetFeed worksheetFeed= spreadsheetService.getFeed (
worksheetFeedUrl, WorksheetFeed.class);
List <WorksheetEntry> worksheetEntrys= worksheetFeed.getEntries ();
WorksheetEntry worksheetEntry= worksheetEntrys.get (0);
// Write header line into Spreadsheet
URL cellFeedUrl= worksheetEntry.getCellFeedUrl ();
CellFeed cellFeed= spreadsheetService.getFeed (cellFeedUrl,
CellFeed.class);
CellEntry cellEntry= new CellEntry (1, 1, "headline1");
cellFeed.insert (cellEntry);
cellEntry= new CellEntry (1, 2, "headline2");
cellFeed.insert (cellEntry);
I haven't tried it, but it looks to me like that is described in the section on "Creating a table":
TableEntry tableEntry = new TableEntry();
FeedURLFactory factory = FeedURLFactory.getDefault();
URL tableFeedUrl = factory.getTableFeedUrl(spreadsheetEntry.getKey());
// Specify a basic table:
tableEntry.setTitle(new PlainTextConstruct("New Table"));
tableEntry.setWorksheet(new Worksheet("Sheet1"));
tableEntry.setHeader(new Header(1));
// Specify columns in the table, start row, number of rows.
Data tableData = new Data();
tableData.setNumberOfRows(0);
// Start row index cannot overlap with header row.
tableData.setStartIndex(2);
// This table has only one column.
tableData.addColumn(new Column("A", "Column A"));
tableEntry.setData(tableData);
service.insert(tableFeedUrl, tableEntry);
Specifically, the part tableEntry.setHeader(new Header(1)) seems like it creates a header on the first row. Then, tableData.setStartIndex(2) seems to specify that data shouldn't go in the first row (since it's the header). Finally, tableData.addColumn(new Column("A", "Column A")) seems to add a column that would be labeled in the header.