Getting 'trigrams' in Java - java

I am having a bit of an issue getting trigrams in Java. My program can currently get bigrams fine but when I try to implement the same structure of the method and change it to get trigrams it seems to not work as well.
I want the trigrams to get every possible combination of words within the arraylist, e.g.
Original = [eye, test, find, free, nhs]
Trigram = [eye test find, 2, eye test free, 3, eye test nhs, 4, eye find free, 3, eye find nhs, 4, eye free nhs, 4, etc...]
The numbers give the distance between the first and last word of each group, and the list should contain every combination of three words in the ArrayList. This currently works fine for bigrams:
Original = [eye, test, find, free, nhs]
Bigram = [eye test, 1, eye find, 2, eye free, 3, eye nhs, 4, test find, 1, test free, 2, test nhs, 3, find free, 1, etc..]
Here are the methods
public ArrayList<String> bagOfWords;
public ArrayList<String> bigramList = new ArrayList<String>();
public ArrayList<String> trigramList = new ArrayList<String>();
public void trigram() throws FileNotFoundException{
PrintWriter tg = new PrintWriter(new File(trigramFile));
// CREATES THE TRIGRAM
for (int i = 0; i < bagOfWords.size() - 1; i++) {
for (int j = 1; j < bagOfWords.size() - 1; j++) {
for(int k = j + 1; k < bagOfWords.size(); k++){
int distance = (k - i);
if (distance < 4){
trigramList.add(bagOfWords.get(i) + " " + bagOfWords.get(j) + " " + bagOfWords.get(k) + ", " + distance);
}
}
}
}
public void bigram() throws FileNotFoundException{
// CREATES THE BIGRAM
PrintWriter bg = new PrintWriter(new File(bigramFile));
for (int i = 0; i < bagOfWords.size() - 1; i++) {
for (int j = i + 1; j < bagOfWords.size(); j++) {
int distance = (j - i);
if (distance < 4){
bigramList.add(bagOfWords.get(i) + " " + bagOfWords.get(j) + ", " + distance);
}
}
}
Can anyone help me alter the trigram() method to create an appropriate trigram for what I need?
Thanks for any help.

You want j to start at i+1, don't you? Also, I think you are letting i count too far. It should stop at bagOfWords.size() - 2. I am not sure why you check distance < 4. This will throw out valid groups.
public void trigram() throws FileNotFoundException {
    PrintWriter tg = new PrintWriter(new File(trigramFile));
    // CREATES THE TRIGRAM
    for (int i = 0; i < bagOfWords.size() - 2; i++) {
        for (int j = i + 1; j < bagOfWords.size() - 1; j++) {
            for (int k = j + 1; k < bagOfWords.size(); k++) {
                int distance = (k - i);
                trigramList.add(bagOfWords.get(i) + " " + bagOfWords.get(j) + " " + bagOfWords.get(k) + ", " + distance);
            }
        }
    }
}

The answer by #bradimus is exactly right; I'm just going to show another approach. Did you notice that your methods are very similar? So why not try merging them into one universal method? Something like the following:
public List<String> anygram(List<String> bagOfWords, int gramCount){
List<String> result = new ArrayList<String>();
for(int i=0;i<=bagOfWords.size()-gramCount; i++){
for(int j=i; j+gramCount<=bagOfWords.size(); j++){
StringBuilder builder = new StringBuilder();
builder.append(bagOfWords.get(i));
int k = j+1;
for(; k<j+gramCount; k++){
builder.append(" ");
builder.append(bagOfWords.get(k));
}
builder.append(", ").append(k-i-1);
result.add(builder.toString());
}
}
return result;
}
My answer is not for rating. I just became interested in this task and came up with this solution.
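One caveat on anygram above: it keeps the first word fixed and slides a contiguous run of words behind it, so for gramCount = 3 it never produces groups such as "eye test free". If every order-preserving combination is wanted, as the question describes, a recursive helper is one option. The following is only a minimal sketch under that assumption, not the method from either answer; the class name GramCombinations and its method names are made up for illustration, and no distance cut-off is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GramCombinations {

    // Returns every order-preserving combination of gramCount words, each
    // followed by ", " and the index distance between its first and last word.
    static List<String> allGrams(List<String> words, int gramCount) {
        List<String> result = new ArrayList<>();
        if (gramCount < 1) {
            return result;
        }
        collect(words, gramCount, 0, new ArrayList<>(), result);
        return result;
    }

    // chosen holds the indices picked so far; start is the next candidate index.
    private static void collect(List<String> words, int gramCount, int start,
                                List<Integer> chosen, List<String> result) {
        if (chosen.size() == gramCount) {
            StringBuilder builder = new StringBuilder();
            for (int index : chosen) {
                if (builder.length() > 0) {
                    builder.append(' ');
                }
                builder.append(words.get(index));
            }
            int distance = chosen.get(gramCount - 1) - chosen.get(0);
            result.add(builder.append(", ").append(distance).toString());
            return;
        }
        for (int i = start; i < words.size(); i++) {
            chosen.add(i);
            collect(words, gramCount, i + 1, chosen, result);
            chosen.remove(chosen.size() - 1); // backtrack: drop the last index
        }
    }

    public static void main(String[] args) {
        List<String> bag = Arrays.asList("eye", "test", "find", "free", "nhs");
        System.out.println(allGrams(bag, 2)); // bigrams
        System.out.println(allGrams(bag, 3)); // trigrams
    }
}
```

Because the recursion depth equals gramCount, the same method covers bigrams, trigrams and larger groups; the number of results grows combinatorially, so a distance filter may still be worth adding for long word lists.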

Related

how do I print my array using enhanced loops

So as my program runs, it does some calculations for the Fibonacci sequence. I can easily print these 20 Fibonacci sequences out in the for loop, but I just want to do the calculations to fill the Fib array up and do the printing out in an enhanced loop. How can I do this with an enhanced for loop?
public static void main(String[] args) {
int[] Fib = new int[20];
Fib[0] = 0;
Fib[1] = 1;
System.out.printf("%d %d", Fib[0], Fib[1]);
for (int i = 2; i < Fib.length; i++) {
Fib[2] = Fib[0] + Fib[1];
Fib[0] = Fib[1];
Fib[1] = Fib[2];
System.out.print(" "+Fib[2]);
for (int x : Fib) {
x = Fib[2];
System.out.print(" "+x);;
}
}
}
Your code contains a few mistakes; the major ones are:
Fib[2] = Fib[0] + Fib[1]; this should be Fib[i] = Fib[i - 1] + Fib[i - 2];
Another mistake is nesting the for loops. The enhanced for loop should come after the loop with index i.
int[] Fib = new int[20];
Fib[0] = 0;
Fib[1] = 1;
for (int i = 2; i < Fib.length; i++) {
Fib[i] = Fib[i - 1] + Fib[i - 2];
}
for (int number : Fib) {
System.out.print(" " + number);
}
You can use Arrays.toString(Fib) instead of using a loop for printing the array.
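To put the pieces together, here is a small self-contained sketch; the class name FibDemo and the helper fibonacci are hypothetical names chosen for the example. It also shows why x = Fib[2] in the question has no effect on the array: the enhanced for loop variable is a copy of each element, so it can be read but assigning to it does not write back.

```java
import java.util.Arrays;

class FibDemo {

    // Fills an array with the first `count` Fibonacci numbers.
    // Assumes count >= 2, as in the question (20 elements).
    static int[] fibonacci(int count) {
        int[] fib = new int[count];
        fib[0] = 0;
        fib[1] = 1;
        for (int i = 2; i < fib.length; i++) {
            fib[i] = fib[i - 1] + fib[i - 2];
        }
        return fib;
    }

    public static void main(String[] args) {
        int[] fib = fibonacci(20);

        // Enhanced for loop: `number` is a copy of each element, so this
        // loop is for reading/printing only.
        for (int number : fib) {
            System.out.print(number + " ");
        }
        System.out.println();

        // One-line alternative for printing the whole array.
        System.out.println(Arrays.toString(fib));
    }
}
```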

Java - 2D arrays, checking for duplication

The variable "num" is a 2D array. I'm trying to check in that array, if there are any duplicates. "num" is a user-input.
I have looked extensively through the Java documentation and asked my lecturers, and I can't get a working answer. I understand the concept of what I'm meant to do, but I just can't get the coding right.
Here is my code:
for(int i = 0; i < 3; i++){ //3 rows with 5 numbers each
for(int j = 0; j < 5; j++){
num[i][j] = Integer.parseInt(JOptionPane.showInputDialog(null, "Enter value for line: " + i + " and position: "+ j ));
if((num[i][j] == num[i][0]) || (num[i][j] == num[i][1]) ||(num[i][j] == num[i][2]) || (num[i][j] == num[i][3]) || (num[i][j] == num[i][4])){
if(num[i][j] != 0){
num[i][j] = Integer.parseInt(JOptionPane.showInputDialog(null, "ERROR. Enter value for line: " + i + " and position: "+ j ));
}
}
}
}
I have also tried using HashSet, but I think that only works with 1D arrays.
I would like to use something like this, as I feel this I understand the most:
secret = new Random().ints(1, 40).distinct().limit(5).toArray();
But obviously not with Random.
I've tried this:
Set<Integer> check = new HashSet<>();
Random gen = new Random();
for(int i = 0; i < 3; i++){ // 3 rows, 5 numbers
for(int j = 0; j < 5; j++){
num[i][j] = Integer.parseInt(JOptionPane.showInputDialog(null, "Enter value for row " + i + " and position " + j));
check.add(gen.nextInt(num[i][j]));
}
}
This last section of coding (directly above this) compiles and runs, but doesn't check for duplicates.
There are alternative ways of checking for duplicates (e.g. you could loop back through the data you've already entered into the 2D array), but here's how I'd use a Set to do it. I'm assuming you are trying to populate the 2D array with all unique values, where each value comes from the user. (Stating this explicitly in the original post would have been very helpful; thanks to Michael Markidis for specifying it.)
From a UX point of view, showing the ERROR in its own dialog is helpful to the end user, as combining the error with the re-input prompt is confusing.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.swing.JOptionPane;
public class App {
public static void main(String[] args) {
int[][] num = new int[3][5];
System.out.println("Before:");
for (int i = 0; i < 3; ++i)
System.out.println(Arrays.toString(num[i]));
Set<Integer> data = new HashSet<Integer>();
for (int i = 0; i < 3; i++) { // 3 rows with 5 numbers each
for (int j = 0; j < 5; j++) {
boolean isGoodInput = false;
while (!isGoodInput) {
String input = JOptionPane.showInputDialog(null, "Enter value for line: " + i + " and position: " + j);
Integer n = Integer.parseInt(input);
if (data.contains(n)) {
JOptionPane.showMessageDialog(null, "ERROR: Try again");
} else {
num[i][j] = n;
isGoodInput = data.add(n);
}
}
}
}
System.out.println("After:");
for (int i = 0; i < 3; ++i)
System.out.println(Arrays.toString(num[i]));
}
}
Note: the 2D array is limited to your specification in the original post as a 3x5, so you'd have to change these values in multiple places to make different sized arrays - perhaps making these more dynamic could speed up further development of this application in the future.
Here's one way to accomplish this where you use the hashset to track what has already been inserted into the 2D array:
int[][] num = new int[3][5];
Set<Integer> check = new HashSet<>();
for (int i = 0; i < 3; i++)
{ // 3 rows, 5 numbers
for (int j = 0; j < 5; j++)
{
int n = 0;
do
{
n = Integer.parseInt(JOptionPane.showInputDialog(null, "Enter value for row " + i + " and position " + j));
} while (!check.add(n)); // keep looping if it was in the hashset
// add it to the array since we know n is not a duplicate at this point
num[i][j] = n;
}
}
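Both answers rely on the fact that Set.add returns false when the value is already present, and that works across rows because a single set is shared by the whole grid. If you only need to check an already-filled 2D array, the same idea can be sketched without any dialogs; the class and method names below are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateCheck {

    // Returns true if any value occurs more than once anywhere in the grid.
    static boolean hasDuplicates(int[][] grid) {
        Set<Integer> seen = new HashSet<>();
        for (int[] row : grid) {
            for (int value : row) {
                // add returns false when the value was already in the set
                if (!seen.add(value)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] unique = {{1, 2, 3}, {4, 5, 6}};
        int[][] repeated = {{1, 2, 3}, {4, 2, 6}};
        System.out.println(hasDuplicates(unique));   // false
        System.out.println(hasDuplicates(repeated)); // true
    }
}
```

Using the enhanced for loops over the rows also avoids hard-coding the 3 and 5, so the check works for any grid size.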

Java using for-loop to produce series of numbers

I am trying to write a collection of for-loops that produce the following series of numbers below. I am trying to accommodate my loops to print each series on the same line, with spaces between each term. I am new to java and got really confused on how exactly I can accomplish it. On the right side are the digits I am increasing the counting by.
1. 4 5 6 7 8 9 10 (+1)
2. 6 5 4 3 2 1 (-1)
3. 2 4 6 8 10 12 14 16 (+2)
4. 19 17 15 13 11 9 7 5 (-2)
5. 7 15 23 31 39 (+8)
6. 2 4 8 16 32 64 (*2)
Here is the code the way I tried to accomplish it. I got the first row to work, but I'm wondering whether there's an easy way to create the rest of them without duplicating the program.
import acm.program.*;
public class ppLoop extends ConsoleProgram {
public void run()
{
{
for(int row = 1; row < 2; row++)
{
print(" " + row + ". ");
for (int col = 4; col < 11; col++)
{
print(row*col + " ");
} //col values
println( );
} //row values
}
}
}
I am new to java and right now going over for-loops and trying to accomplish this in for-loop. If someone could help me out, I would really appreciate it.
Thank you!
Edit:
Here is what happens when I increase the number of rows.
Edit:
Here is the solution of what I had tried accomplishing. Thanks to everyone who helped me.
import acm.program.*;
public class ppLoop extends ConsoleProgram
{
public void run()
{
{
for(int i = 1; i < 2; i++) // One iteration of outer loop
{
print(i + ". "); // print row number 1
for (int j = 4; j < 11; j++) // loop for row 1
{
print(j + " ");
}
println( );
print((i + 1) + ". ");
for (int j = 6; j > 0; j--) // loop for row 2
{
print(j + " ");
}
println();
print((i + 2) + ". ");
for (int j = 2; j < 17; j = j + 2) // loop for row 3
{
print(j + " ");
}
println();
print((i + 3) + ". ");
for (int j = 19; j > 4; j = j - 2) // loop for row 4
{
print(j + " ");
}
println();
print((i + 4) + ". ");
for (int j = 7; j < 40; j = j + 8) // loop for row 5
{
print(j + " ");
}
println();
print((i + 5) + ". ");
for (int j = 2; j < 65; j = j * 2) // loop for row 6
{
print(j + " ");
}
println();
}
} //close outer loop
} //close public run
} //close console program
You can perform this program with a series of nested loops. I have done the first three rows. I took out your package and used a main method. Also, your indentation was very confusing. Since your increment changes each line, I don't know of a way to make it any shorter than this using for loops.
public class ppLoop{
public static void main(String[] args)
{
{
for(int i = 1; i < 2; i++) // One iteration of outer loop
{
System.out.print(i + ". "); // print row number
// you can use the same variable for each inner loop
for (int j = 4; j < 11; j++) // loop for row 1
{
System.out.print(j + " ");
}
System.out.println( );
System.out.print((i + 1) + ". ");
for (int j = 6; j > 0; j--) // loop for row 2
{
System.out.print(j + " ");
}
System.out.println();
System.out.print((i + 2) + ". ");
for (int j = 2; j < 17; j = j + 2) // loop for row 3
{
System.out.print(j + " ");
}
}
}
}
}
You could create a method that takes:
1. A start number
2. What math operation to perform (add, subtract, or multiply)
3. What number to increment/decrement or multiply by
4. An end number
It would look similar to this:
public void formattedFor(int startNum, String operation, int num, int endNum) {
if (operation.equals("add")) {
for (int i = startNum; i < endNum; i += num) {
System.out.print(i + " ");
}
}
else if (operation.equals("sub")) {
for (int i = startNum; i > endNum; i -= num) {
System.out.print(i + " ");
}
}
else if (operation.equals("mult")) {
for (int i = startNum; i < endNum; i *= num) {
System.out.print(i + " ");
}
}
System.out.println( );
}
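A variation on the same idea, if lambdas are allowed in your course, is to pass the step and the stop condition as functions instead of a string. This is only a sketch: SeriesPrinter and series are hypothetical names, and the stop condition is written so that the end value is included.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class SeriesPrinter {

    // Builds "a b c ..." starting at start, applying next to get the
    // following term for as long as keepGoing accepts the current term.
    static String series(int start, IntUnaryOperator next, IntPredicate keepGoing) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; keepGoing.test(i); i = next.applyAsInt(i)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("1. " + series(4, n -> n + 1, n -> n <= 10));
        System.out.println("4. " + series(19, n -> n - 2, n -> n >= 5));
        System.out.println("6. " + series(2, n -> n * 2, n -> n <= 64));
    }
}
```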
If I'm understanding the problem, you want to print six series that each start with a different number and increment/decrement that number by some value. Since I see no relationship between the initial value and the increment/decrement, you're going to have to write six separate for loops.
If you're absolutely averse to this, you can store your initial values, your increments/decrements, and your final values in an array and iterate through them using a for loop, an if statement (to deal with the multiplication) and a while loop. The array would look like this:
int[][] values = new int[][] {
{4, 6, 2, 19, 7, 2},
{1, -1, 2, -2, 8, 2},
{10, 1, 16, 5, 39, 64}
};
I could write up the source based on this, but it's not what you asked for.
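For completeness, a rough sketch of what that could look like, using the array above. The names SeriesTable, VALUES and MULTIPLY_INDEX are made up here, and it assumes the final values in the third row are inclusive and that only the last series multiplies.

```java
class SeriesTable {

    // Row 0: start values, row 1: steps, row 2: final values (inclusive).
    static final int[][] VALUES = {
        {4, 6, 2, 19, 7, 2},
        {1, -1, 2, -2, 8, 2},
        {10, 1, 16, 5, 39, 64}
    };

    // Assumption: only the last series multiplies instead of adding.
    static final int MULTIPLY_INDEX = 5;

    // Builds one series as a space-separated string.
    static String series(int index) {
        int current = VALUES[0][index];
        int step = VALUES[1][index];
        int last = VALUES[2][index];
        boolean multiply = index == MULTIPLY_INDEX;
        boolean ascending = multiply || step > 0;

        StringBuilder sb = new StringBuilder();
        while (ascending ? current <= last : current >= last) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(current);
            if (multiply) {
                current *= step;
            } else {
                current += step;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int i = 0; i < VALUES[0].length; i++) {
            System.out.println((i + 1) + ". " + series(i));
        }
    }
}
```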
I strongly suspect that, if this is a homework assignment and you've modified the problem, there's something you've failed to understand about the problem itself. If this is meant to have a simple solution that uses for loops, there should probably be some logic that binds the rows together, unless you're allowed to use arrays/while loops/for loops/objects and methods.
On another note, you should format your code differently. It's somewhat difficult to read right now. In general, indent things that happen inside loops, classes, or functions. For example:
import acm.program.*;
public class ppLoop extends ConsoleProgram {
    public void run() {
        for (int row = 1; row < 2; row++) {
            print(" " + row + ". ");
            for (int col = 4; col < 11; col++) {
                print(row*col + " ");
            } //col values
            println( );
        } //row values
    }
}

Creating word pairs, triplets etc for evaluation in Bleu

I need to create a list of word pairs, triplets etc for evaluation in the Bleu metric. Bleu starts with unigrams (a single word) and goes up to N-grams - the N being specified at runtime.
For example, given the sentence
"Israeli officials are responsible for airport security"
For unigrams it would just be a list of the words. For bigrams it would be
Israeli officials
officials are
are responsible
responsible for
for airport
airport security
The relevant trigrams are
Israeli officials are
officials are responsible
are responsible for
responsible for airport
for airport security
I've coded a working Bleu that hard codes the NGrams to 4 and brute forces the calculations of the unigrams etc. It's ugly as hell, and besides, I need to be able to supply the N at run time.
The snippet that's trying to generate the pairs / triplets etc -
String current = "";
int temp = 0;
for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
current = current + ":" + goldWords[i];
while (temp < N_GRAM_ORDER) {
current = current + ":" + goldWords[temp + i];
temp++;
}
goldNGrams.add(current);
current = "";
temp = 0;
}
}
Edit - so the output from this snippet should be for bigrams -
israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
Where goldWords is a String array containing the individual words to be made into NGrams.
I've been tinkering with this loop for days, drawing out the relationships etc and it just won't click for me. Can anyone see what I'm doing wrong?
I would change this:
String current = "";
int temp = 0;
for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
current = current + ":" + goldWords[i];
while (temp < N_GRAM_ORDER) {
current = current + ":" + goldWords[temp + i];
temp++;
}
goldNGrams.add(current);
current = "";
temp = 0;
}
}
to this:
String current = "";
for (int i = 0; i < goldWords.length; i++) {
    for (int j = 0; j < N_GRAM_ORDER; j++) {
        if (i + j < goldWords.length)
            // goldWords is an array, so use .length; only add ":" between words
            current += (j == 0 ? "" : ":") + goldWords[i + j];
    }
    goldNGrams.add(current);
    current = "";
}
So, the outer for loop iterates through the first word to be included, the inner loop iterates through all the words to be included. One thing to note is that the if statement is used to prevent an array out of bounds error. This should be moved to outside the inner for loop if you only want complete n-grams.
With the if statement where it is you will get:
Israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
security
If you want:
Israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
instead, try this code:
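String current = "";
for (int i = 0; i < goldWords.length; i++) {
    // <= so that the last complete n-gram is included
    if (i + N_GRAM_ORDER <= goldWords.length) {
        for (int j = 0; j < N_GRAM_ORDER; j++) {
            current += (j == 0 ? "" : ":") + goldWords[i + j];
        }
        // add inside the if, so no empty strings are stored for incomplete n-grams
        goldNGrams.add(current);
    }
    current = "";
}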
(the above code is done without checking it against the compiler, so there might be an Off By One or minor syntax error in it. Validate it, but it will get you close).
Here's an alternative that uses a String[] to collect the ngrams instead of a string. I changed the number of iterations on the outer for loop to ensure it captures the last n-gram.
public static List<String[]> ngrams(String[] gold, int n_length) {
List<String[]> list = new ArrayList<String[]>();
for (int i = 0; i < gold.length - (n_length-1); i++) {
String[] ngram = new String[n_length];
for(int j = 0; j < n_length; j++) {
ngram[j] = gold[i+j];
}
list.add(ngram);
}
return list;
}
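If you would rather keep the colon-joined strings from the question, the same loop bounds can be combined with String.join and Arrays.copyOfRange. This is a sketch only; the class name NGrams is made up, and the ":" separator is taken from the question's expected output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NGrams {

    // Returns all complete, overlapping n-grams with the words joined by ":".
    static List<String> ngrams(String[] words, int n) {
        List<String> result = new ArrayList<>();
        if (n < 1) {
            return result;
        }
        // i + n <= words.length keeps the last complete n-gram and skips partial ones
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(":", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] gold = "Israeli officials are responsible for airport security".split(" ");
        int maxOrder = 4; // N supplied at run time, e.g. 4 for Bleu
        for (int n = 1; n <= maxOrder; n++) {
            System.out.println(n + "-grams: " + ngrams(gold, n));
        }
    }
}
```

Calling it in a loop from 1 to N gives the unigrams up to the N-grams without hard-coding the order.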
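This variant splits the words into consecutive, non-overlapping groups of N_GRAM_ORDER words; any leftover words form a final, shorter group. Note that these are chunks rather than the overlapping n-grams described in the question.
int N_GRAM_ORDER = 3, temp = 0, i;
String current = "";
for (i = 0; i <= goldWords.length - N_GRAM_ORDER; i += N_GRAM_ORDER) {
    while (temp < N_GRAM_ORDER) {
        // only add ":" between words, not before the first one
        current = current + (temp == 0 ? "" : ":") + goldWords[temp + i];
        temp++;
    }
    goldNGrams.add(current);
    current = "";
    temp = 0;
}
// collect any leftover words into a final, shorter group
if (i < goldWords.length) {
    current = goldWords[i++];
    while (i < goldWords.length) {
        current = current + ":" + goldWords[i++];
    }
    goldNGrams.add(current);
}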
output
Israeli:officials:are
responsible:for:airport
security

Java algorithm for all-pairs

Given a collection of integers, what's a Java algorithm that will give all pairs of items as follows..
Given the example collection: [1,3,5], we'd want the output:
[1-1]
[3-3]
[5-5]
[1-3]
[1-5]
[3-5]
Note that ordering is not important, so we want one of [1-3], [3-1] but not both.
This should work with a collection of n numbers, not just the three numbers in this example.
The function below should do this:
private void printPermutations(int[] numbers) {
for(int i=0;i<numbers.length; i++) {
for (int j=i; j<numbers.length; j++) {
System.out.println("[" + numbers[i] + "-"+ numbers[j] +"]");
}
}
}
Example call to this function
int[] numbers={1,2,3};
printPermutations(numbers);
Sounds like homework...but here it is anyway. Obviously you can do without an ArrayList, etc. - just quick and dirty.
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
int[] input = {1, 3, 5};
ArrayList<String> output = new ArrayList<String>();
int n = input.length;
for (int left = 0; left < n; left++) {
output.add("["+input[left]+"-"+input[left]+"]");
for (int right = left + 1; right < n; right++) {
output.add("["+input[left]+"-"+input[right]+"]");
}
}
System.out.println(output.toString());
}
}
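Here's the logic you want (shown in JavaScript rather than Java).
function subsequences (arr) {
arr.sort (function (a, b) { return a - b; }); // numeric sort; the default sort compares elements as strings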
var subseqs = [];
for (var i = 0; i < arr.length; ++i) {
for (var j = i; j < arr.length; ++j) {
subseqs.push ("" + arr [i] + "-" + arr [j]);
}
}
return subseqs;
}
