Convert Java code to C# - java

I am using this code to measure sentence similarity. The code is available in Java, and I want to use it in C#.
public static int getWordChanges(String s1, String s2) {
int similarityThreshold = 50;
int wordChanges = 0;
s1 = s1.toLowerCase().replace(".", "").replace(",", "").replace(";", "");
s2 = s2.toLowerCase().replace(".", "").replace(",", "").replace(";", "");
//Loop through each word in s1
for (int i = 0; i < s1.split(" ").length; i++) {
boolean exists = false;
//Search for i'th word in s1 in s2
for (int j = 0; j < s2.split(" ").length; j++) {
//Is the word misspelled?
if ((getLevenshteinDistance(s1.split(" ")[i], s2.split(" ")[j]) * 100 / s1.split(" ")[i].length()) < similarityThreshold) {
exists = true;
break;
}
}
//If the word does not exist, increment wordChanges
if (!exists) {
wordChanges++;
}
}
return wordChanges;
}
This is the Java code; I want to run it in C#.
After converting the code to C#:
public int getWordChanges(String s1, String s2)
{
int similarityThreshold = 50;
int wordChanges = 0;
s1 = s1.ToLower().Replace(".", "").Replace(",", "").Replace(";", "");
s2 = s2.ToLower().Replace(".", "").Replace(",", "").Replace(";", "");
//Loop through each word in s1
for (int i = 0; i < s1.Split(' ').Length; i++)
{
bool exists = false;
//Search for i'th word in s1 in s2
for (int j = 0; j < s2.Split(' ').Length; j++)
{
//Is the word misspelled?
if ((getLevenshteinDistance(s1.Split(' ')[i], s2.Split(' ')[j]) * 100 / s1.Split(' ')[i].Length()) < similarityThreshold)
{
exists = true;
break;
}
}
//If the word does not exist, increment wordChanges
if (!exists)
{
wordChanges++;
}
}
return wordChanges;
}
}
}
There is an error on this line:
if ((getLevenshteinDistance(s1.Split(' ')[i], s2.Split(' ')[j]) * 100 / s1.Split(' ')[i].Length()) < similarityThreshold)
The error is reported on Length. How do I resolve it?

Add this function to your project
public static int getLevenshteinDistance(string s, string t)
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
Source
And change .Length() to .Length, because String.Length is a property and not a method.
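For comparison, the Java original can be tidied in a similar way. The sketch below is an illustration rather than the asker's code: it splits each sentence once instead of on every loop iteration, carries its own `levenshtein` helper (the question does not show where `getLevenshteinDistance` comes from), and skips empty tokens, which is an added assumption to avoid dividing by a zero word length. The class and helper names are hypothetical.

```java
class WordChanges {
    // Dynamic-programming Levenshtein distance, same recurrence as the C# version above.
    static int levenshtein(String s, String t) {
        int n = s.length(), m = t.length();
        if (n == 0) return m;
        if (m == 0) return n;
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i;
        for (int j = 0; j <= m; j++) d[0][j] = j;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[n][m];
    }

    // Lower-case the sentence and strip the punctuation the question strips.
    static String normalize(String s) {
        return s.toLowerCase().replace(".", "").replace(",", "").replace(";", "");
    }

    static int getWordChanges(String s1, String s2) {
        int similarityThreshold = 50;
        String[] words1 = normalize(s1).split(" "); // split once, not per iteration
        String[] words2 = normalize(s2).split(" ");
        int wordChanges = 0;
        for (String w1 : words1) {
            if (w1.isEmpty()) continue; // assumption: ignore empty tokens (avoids division by zero)
            boolean exists = false;
            for (String w2 : words2) {
                if (levenshtein(w1, w2) * 100 / w1.length() < similarityThreshold) {
                    exists = true;
                    break;
                }
            }
            if (!exists) wordChanges++;
        }
        return wordChanges;
    }
}
```

Called as `WordChanges.getWordChanges("The quick brown fox.", "the quikc brown dog")`, only "fox" has no close match in the second sentence.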

Related

Using Bubble Sort to Alphabetically Sort Array of Names in Java

I've been trying to tackle this bug for a while, but I can't get past it. The purpose of the program below is to use bubble sort to alphabetically order an array of names. For example, if the names are ["Bob Joe", "Bob Frank", "Bob Johnson"], the correctly sorted array would be ["Bob Frank", "Bob Joe", "Bob Johnson"].
The main challenge I am having is comparing any 2 strings past name.charAt(0). If I only compare the characters of any 2 strings at 1 specific index point, my code works. However, if I try to make the comparison move past index 0 when the characters at index 0 of both strings are equal, my program no longer works.
The code is outlined below:
public static void sortAlpha (String names[])
{
for (int i = 0 ; i < names.length - 1 ; i++)
{
for (int a = 0 ; a < names.length - 1 - i ; a++)
{
int length1 = names [a].length ();
int length2 = names [a + 1].length ();
int min = 1;
if (length1 > length2)
{
min = length2;
}
else
{
min = length1;
}
for (int b = 0 ; b < min ; b++)
{
if ((int) names [a].toLowerCase ().charAt (b) > (int) names [a + 1].toLowerCase ().charAt (b))
{
String tempName = names [a];
// sort:
names [a] = names [a + 1];
names [a + 1] = tempName;
break;
}
}
}
}
}
If I simply default the min value to 1, the code runs and does its intended job. However, if the min value stays dynamic, the program does not work. I'm trying to discern why this is so and what the fix is. Any help would be appreciated!
Check this out.
public static void sortAlpha(String names[]) {
for (int i = 0; i < names.length - 1; i++) {
for (int a = 0; a < names.length - 1 - i; a++) {
int lengthLeft = names[a].length();
int lengthRight = names[a + 1].length();
int minLength = lengthLeft > lengthRight ? lengthRight : lengthLeft;
for (int b = 0; b < minLength; b++) {
int letterLeft = (int) names[a].toLowerCase().charAt(b);
int letterRight = (int) names[a + 1].toLowerCase().charAt(b);
if (letterLeft > letterRight) {
String tempName = names[a];
// sort:
names[a] = names[a + 1];
names[a + 1] = tempName;
break;
} else if (letterLeft == letterRight) {
// if the letters are the same go for the next letters
continue;
} else {
// if it's already in the right position - stop.
break;
}
}
}
}
}
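One case the character loop above does not appear to cover is a name that is a prefix of its neighbour (for example "Bob Johnson" followed by "Bob John"): every compared letter is equal, so no swap happens. A minimal sketch that sidesteps this by delegating the comparison to `String.compareToIgnoreCase`, keeping the bubble sort structure; the class name `NameSort` is hypothetical:

```java
class NameSort {
    // Bubble sort; the per-character comparison is left to compareToIgnoreCase,
    // which also orders a shorter string before a longer one sharing its prefix.
    static void sortAlpha(String[] names) {
        for (int i = 0; i < names.length - 1; i++) {
            for (int a = 0; a < names.length - 1 - i; a++) {
                if (names[a].compareToIgnoreCase(names[a + 1]) > 0) {
                    String tempName = names[a];
                    names[a] = names[a + 1];
                    names[a + 1] = tempName;
                }
            }
        }
    }
}
```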
Use this
for (int i = 0; i < count; i++)
{
for (int j = i + 1; j < count; j++) {
if (str[i].compareTo(str[j])>0)
{
temp = str[i];
str[i] = str[j];
str[j] = temp;
}
}
}
You can simply use compareTo() and a temp variable to compare and store
Scanner sc = new Scanner(System.in);
String n[]= new String[5];
System.out.println("Enter the String");
for(int k = 0;k<5;k++) {
n[k] = sc.nextLine();
}
String temp;
System.out.println("sorted order:");
for (int j = 0; j < n.length; j++) {
for (int i = j + 1; i < n.length; i++) {
if (n[i].compareTo(n[j]) < 0) {
temp = n[j];
n[j] = n[i];
n[i] = temp;
}
}
System.out.println(n[j]);
}

How to check if a char is upper/lowercase?

The following code is supposed to convert letters to numbers and give the sum, but ignore any letters that are uppercase.
Example:
The input abcde should return 15. The input abCde should return 12.
Any help is appreciated.
static int strScore(String str[], String s, int n) {
int score = 0, index=0;
for (int i = 0; i < n; i++) {
if (str[i] == s) {
for (int j = 0; j < s.length(); j++)
score += s.charAt(j) - 'a' + 1;
index = i + 1;
break;
}
}
score = score * index;
return score;
}
public static void main(String[] args) {
String str[] = { "abcde" };
String s = "abcde";
int n = str.length;
int score = strScore(str, s, n);
System.out.println( score);
}
Use Character.isLowerCase(...).
So this is what your strScore method should look like:
static int strScore(String str[], String s, int n) {
int score = 0, index = 0;
for (int i = 0; i < n; i++) {
if (str[i].equals(s)) {
for (int j = 0; j < s.length(); j++) {
char c = s.charAt(j);
if(Character.isLowerCase(c)) // <-- This is the important part
score += c - 'a' + 1;
}
index = i + 1;
break;
}
}
score = score * index;
return score;
}
As pointed out in the comments, there is no need for the str parameter, and therefore no need for the n parameter either. This is a better version:
static int strScore(String s) {
int score = 0;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if(Character.isLowerCase(c))
score += c - 'a' + 1;
}
return score;
}
There are two things to address:
You used == to compare strings. You need to use .equals().
You need to add a check like if(s.charAt(j) >= 'a' && s.charAt(j) <= 'z')
for (int i = 0; i < n; i++) {
if (str[i].equals(s)) {
for (int j = 0; j < s.length(); j++)
if(s.charAt(j) >= 'a' && s.charAt(j) <= 'z') {
score += s.charAt(j) - 'a' + 1;
You can avoid passing String str[] = { "abcde" };, which has a single element equal to s, to the method. You can also avoid passing n, which is simply str.length:
static int strScore(String s) {
int score = 0, index = 0;
for (int i = 0; i < s.length(); i++) {
for (char c : s.toCharArray()) {
if(c >= 'a' && c <= 'z') { //alternatively if(Character.isLowerCase(c))
score += c - 'a' + 1;
}
}
index = i + 1;
break;
}
score = score * index;
return score;
}
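A minimal sketch of the range-check variant, assuming only the ASCII letters a to z should be scored. Unlike `Character.isLowerCase`, an explicit range check also ignores non-ASCII lowercase letters, for which `c - 'a' + 1` would fall outside 1 to 26. The class name `LowercaseScore` is hypothetical:

```java
class LowercaseScore {
    // Sums the alphabet positions (a = 1 ... z = 26) of the lowercase ASCII letters in s.
    static int strScore(String s) {
        int score = 0;
        for (char c : s.toCharArray()) {
            if (c >= 'a' && c <= 'z') { // inclusive upper bound so that 'z' is counted
                score += c - 'a' + 1;
            }
        }
        return score;
    }
}
```

With the inputs from the question, `strScore("abcde")` gives 15 and `strScore("abCde")` gives 12.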

Integer variable does not update when if condition is true

public class test
{
static Scanner store = new Scanner(System.in);
public static void main(String[] args)
{
String str1 = args[0];
String str2 = args[1];
System.out.printf("%nThere are %d dissimilar characters in the two strings.%n", CountNotSim(str1, str2));
}
public static int CountNotSim(String str1, String str2)
{
String s1 = str1.toLowerCase();
String s2 = str2.toLowerCase();
char[] a1 = new char[s1.length()];
char[] a2 = new char[s2.length()];
for (int g = 0; g < s1.length(); g++)
a1[g] = s1.charAt(g);
for (int h = 0; h < s2.length(); h++)
a2[h] = s2.charAt(h);
int check = 0, stored;
char[] array = new char[26];
int ctr = s1.length() + s2.length();
for (int i = 0; i < a1.length; i++)
{
check = 0;
stored = 0;
for (int j = 0; j < a2.length; j++)
{
if (a1[i] == a2[j])
{
check++;
for (int k = 0; k < 26; k++)
{
if (array[k] == ' ')
if (stored == 0)
array[k] = a1[i];
if (a1[i] == array[k])
{
stored = 1;
break;
}
}
System.out.print(stored + "/ ");
}
}
if (check > 0)
{
if (stored == 0)
ctr -= (check + 1);
else if (stored == 1)
ctr--;
}
System.out.print(ctr + " "); //checker
}
System.out.println();
return ctr;
}
}
The program checks for dissimilar letters in two strings inputted from the command line. Variable "stored" is supposed to change to 1 whenever there's a match to avoid extra deductions to variable "ctr". However, for some reason, not only does "stored's" value not change, the array "array" also doesn't update its elements whenever there's a match. I'm at a loss on how to fix it--nothing looks incorrect.
You wrote this:
char[] array = new char[26];
...
for (int k = 0; k < 26; k++)
{
if (array[k] == ' ') {
...
But that only sets the length of the array, not its contents.
A new char array is filled with the default char value, which is not the space character but the value 0 (the numeric value 0, i.e. '\u0000', not the character '0').
So array[k] == ' ' will never be true.
Try this instead:
for (int k = 0; k < 26; k++)
{
if (array[k] == 0) {
...
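To illustrate the point, here is a small sketch with two hypothetical helpers: one that treats the default value '\u0000' as "empty", and one that pre-fills the array with spaces so that a comparison against ' ' would work. Either approach is an option; neither is taken from the question's code.

```java
import java.util.Arrays;

class CharDefaultDemo {
    // Returns the index of the first unused slot, treating the default
    // char value '\u0000' as "empty"; returns -1 if every slot is used.
    static int firstEmptySlot(char[] slots) {
        for (int k = 0; k < slots.length; k++) {
            if (slots[k] == '\u0000') {
                return k;
            }
        }
        return -1;
    }

    // Alternative: fill the array with spaces up front, so that
    // a test such as slots[k] == ' ' identifies unused slots.
    static char[] spaceFilled(int size) {
        char[] slots = new char[size];
        Arrays.fill(slots, ' ');
        return slots;
    }
}
```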

Text justification not formatting correctly

I'm trying to make a program that formats entered text so that each line has a specific length (e.g. 20) and cannot exceed it. The words should be spread across each line, with "." padding the gaps to make up the set length.
This is the output I've got so far:
This.is..an..example
of..text..that..will
have..straight..left
and.right....margins
after formatting ...
For some reason, the "." characters do not appear between "after" and "formatting", and after the "g" a dot is missing, with a space taking its place instead. This always seems to happen on the last line.
This is what the output should look like:
This.is..an..example
of..text..that..will
have..straight..left
and.right....margins
after.formatting....
Code:
import java.util.*;
public class FormattedPadding {
public static ArrayList<String> fullJustify(ArrayList<String> a, int b) {
ArrayList<String> result = new ArrayList<String>();
if(a == null || a.size() == 0)
return result;
int i = 0;
int currentLength = 0;
String temp = "";
for(i = 0; i < a.size(); i++){
currentLength += a.get(i).length() + 1;
if(currentLength > b + 1) {
result.add(temp);
temp = "";
currentLength = 0;
i--;
//System.out.println("Intermediate result: " + result);
}
else
temp += a.get(i) + " ";
}
if(!temp.equals(""))
result.add(temp);
for(i = 0; i < result.size() - 1; i++){
temp = result.get(i);
String[] tempArray = temp.split(" ");
int totalLength = 0;
for(int j =0; j < tempArray.length; j++)
totalLength += tempArray[j].length();
int[] spaceCount = getSpaceCount(b-totalLength, tempArray.length);
for(int l =0; l < spaceCount.length; l++)
System.out.print(spaceCount[l] + " " );
System.out.println();
temp = "";
for(int j = 0; j < tempArray.length; j++){
temp += tempArray[j];
for(int k = 0; k < spaceCount[j]; k++)
temp += ".";
}
result.set(i, temp);
}
temp = result.get(result.size() - 1);
if(temp.length() < b){
while(temp.length() < b)
temp += ".";
}
else if(temp.length() > b)
temp = temp.substring(0, b);
result.set(result.size() - 1, temp);
return result;
}
public static int[] getSpaceCount(int freeSpace, int numOfStrings) {
int size = numOfStrings - 1;
int[] ret = new int[size + 1];
if(size == 0){
ret[0] = freeSpace;
}
else {
for(int i =0; i < ret.length; i++) {
if(size != 0){
ret[i] = freeSpace % size == 0 ? freeSpace/size : freeSpace/(size + 1);
}
freeSpace = freeSpace - ret[i];
size--;
}
}
return ret;
}
public static void main(String[] args) {
//ArrayList<String> a = new ArrayList<String>();
System.out.println("#Enter ");
String usrInput = BIO.getString();
String[] items = usrInput.split("\\s+"); // Split where whitespace is encounterd using the RegEx
ArrayList<String> newList = new ArrayList<String>(Arrays.asList(items));
// ^Split input into ArrayList
int b = 20; // Line length
ArrayList<String> result = fullJustify(newList, b);
for(int i =0; i < result.size(); i++) {
System.out.println(result.get(i));
}
System.out.println(result);
}
}
First of all, I think your code will go into an infinite loop if any word is longer than b.
As for the issue: you use different logic for the last line:
temp = result.get(result.size() - 1);
if(temp.length() < b){
while(temp.length() < b)
temp += ".";
}
else if(temp.length() > b)
temp = temp.substring(0, b);
result.set(result.size() - 1, temp);
You can remove that part and change the loop bound from result.size() - 1 to result.size(), so that it covers all lines:
for(i = 0; i < result.size(); i++){
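Note that running the justification loop over every line would appear to push all of the padding between the words of the last line as well (for the example input, after.....formatting). If the last line should instead look like the expected output in the question, one option is to keep a separate last-line step but replace the spaces with dots before padding. A minimal sketch, where `padLastLine` is a hypothetical helper and not part of the original code:

```java
class LastLinePadding {
    // Left-aligns the last line: single dots between words, remaining width filled with dots.
    static String padLastLine(String line, int width) {
        // trim() drops the trailing space left by the line-building loop
        StringBuilder sb = new StringBuilder(line.trim().replace(' ', '.'));
        while (sb.length() < width) {
            sb.append('.');
        }
        if (sb.length() > width) {
            sb.setLength(width); // same truncation as the original last-line logic
        }
        return sb.toString();
    }
}
```

For example, `padLastLine("after formatting ", 20)` yields `after.formatting....`.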

Lucene: - indexing and finding unique terms

I have written code in Lucene that first indexes XML documents and then finds the number of unique terms in the index.
Say there are n unique terms.
I want to generate a matrix of dimensions n x n, where
m[i][j] = (co_occurrence value of terms (i, j)) / (occurrence value of term i)
co_occurrence of terms (i, j) = number of documents in which both the ith and jth terms occur
occurrence of term j = number of documents in which term j occurs.
My code works, but it is not efficient. For a large number of files, where there are more than 2000 terms, it takes more than 10 minutes.
Here is my code for finding the co-occurrence:
int cooccurrence(IndexReader reader, String term_one, String term_two) throws IOException {
int common_doc_no = 0, finaldocno_one = 0, finaldocno_two = 0;
int termdocid_one[] = new int[6000];
int termdocid_two[] = new int[6000];
int first_docids[] = new int[6000];
int second_docids[] = new int[6000];
int k = 0;
for (java.util.Iterator<String> it = reader.getFieldNames(
FieldOption.ALL).iterator(); it.hasNext();) {
String fieldname = (String) it.next();
TermDocs t = reader.termDocs(new Term(fieldname, term_one));
while (t.next()) {
int x = t.doc();
if (termdocid_one[x] != 1) {
finaldocno_one++;
first_docids[k] = x;
k++;
}
termdocid_one[x] = 1;
}
}
/*
* System.out.println("value of finaldoc_one - " + finaldocno_one); for
* (int i = 0; i < finaldocno_one; i++) { System.out.println("" +
* first_docids[i]); }
*/
k = 0;
for (java.util.Iterator<String> it = reader.getFieldNames(
FieldOption.ALL).iterator(); it.hasNext();) {
String fieldname = (String) it.next();
TermDocs t = reader.termDocs(new Term(fieldname, term_two));
while (t.next()) {
int x = t.doc();
if (termdocid_two[x] != 1) {
finaldocno_two++;
second_docids[k] = x;
k++;
}
termdocid_two[x] = 1;
}
}
/*
* System.out.println("value of finaldoc_two - " + finaldocno_two);
*
* for (int i = 0; i < finaldocno_two; i++) { System.out.println("" +
* second_docids[i]); }
*/
int max;
int search = 0;
if (finaldocno_one > finaldocno_two) {
max = finaldocno_one;
search = 1;
} else {
max = finaldocno_two;
search = 2;
}
if (search == 1) {
for (int i = 0; i < max; i++) {
if (termdocid_two[first_docids[i]] == 1)
common_doc_no++;
}
} else if (search == 2) {
for (int i = 0; i < max; i++) {
if (termdocid_one[second_docids[i]] == 1)
common_doc_no++;
}
}
return common_doc_no;
}
Code for calculating the knowledge matrix:
void knowledge_matrix(double matrix[][], IndexReader reader, double avg_matrix[][]) throws IOException {
ArrayList<String> unique_terms_array = new ArrayList<>();
int totallength = unique_term_count(reader, unique_terms_array);
int co_occur_matrix[][] = new int[totallength + 3][totallength + 3];
double rowsum = 0;
for (int i = 1; i <= totallength; i++) {
rowsum = 0;
for (int j = 1; j <= totallength; j++) {
int co_occurence;
int occurence = docno_single_term(reader,
unique_terms_array.get(j - 1));
if (i > j) {
co_occurence = co_occur_matrix[i][j];
} else {
co_occurence = cooccurrence(reader,
unique_terms_array.get(i - 1),
unique_terms_array.get(j - 1));
co_occur_matrix[i][j] = co_occurence;
co_occur_matrix[j][i] = co_occurence;
}
matrix[i][j] = (float) co_occurence / (float) occurence;
rowsum += matrix[i][j];
if (i > 1)
{
avg_matrix[i - 1][j] = matrix[i - 1][j] - matrix[i - 1][0];
}
}
matrix[i][0] = rowsum / totallength;
}
for (int j = 1; j <= totallength; j++) {
avg_matrix[totallength][j] = matrix[totallength][j]
- matrix[totallength][0];
}
}
Could anyone suggest a more efficient way to implement this?
I think you can find the documents for term_one and term_two in a single for loop, using two HashSets to store the doc IDs you find. Then use termOneSet.retainAll(termTwoSet) to get the documents that contain both term_one and term_two.
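The set intersection step could look roughly like this. It is a sketch that uses plain integer doc IDs and makes no Lucene calls; it assumes the two sets have already been filled elsewhere (for instance inside the TermDocs loops from the question). The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class CoOccurrence {
    // Counts the documents that contain both terms, given each term's set of doc IDs.
    static int commonDocCount(Set<Integer> docsWithTermOne, Set<Integer> docsWithTermTwo) {
        // Copy first so that retainAll does not modify the caller's set.
        Set<Integer> common = new HashSet<>(docsWithTermOne);
        common.retainAll(docsWithTermTwo); // keeps only IDs present in both sets
        return common.size();
    }
}
```

Working on a copy means each term's set can be built once and reused for every pair of terms, instead of re-reading the index for each pair.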
