Unintentional thread synchronization when writing to file - java

If I have a thread that counts to 1000 and prints out the result like so:
public class MyRunnable implements Runnable{
@Override
public void run() {
for (int i=0;i<1000;i++){
System.out.println(i+" ");
}
}
}
And in main I do something like:
new Thread(new MyRunnable()).start();
new Thread(new MyRunnable()).start();
the results get jumbled up, since I'm not synchronizing.
But, if my thread looks like:
public class MyRunnable implements Runnable{
Writer w;
MyRunnable(Writer w){
this.w=w;
}
@Override
public void run() {
for (int i=0;i<1000;i++){
try{
w.write(String.valueOf(i));
w.write (" ") ;
} catch (IOException e) {
e.printStackTrace(); // handle or log the exception
}
}
}
}
and in main I have this:
try (Writer writer = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("Thread.txt"), "utf-8"))) {
Thread t1 =new Thread(new MyRunnable(writer));
Thread t2 =new Thread(new MyRunnable(writer));
t1.start();
t2.start();
t1.join();
t2.join();
} //catch the exception,etc.
The results look normal: the threads somehow got synchronized. I imagine it's because I pass writer as a parameter, but could someone explain it better? And, out of curiosity, is there any way around it?
EDIT: My apologies, it seems I hurried too much. I tried to simplify MyRunnable and removed too much code, but I just realized that the situation I described only happens if I open and read a file in the run() method. So run() actually looks like this:
public void run() {
try {
Scanner scanner = new Scanner(f); //f is new File (filename)
while (scanner.hasNext())
{
String str = scanner.next();
if (str.equals("word")) //code counts "word" appearance
count++;
}
scanner.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
for (int i=0;i<1000;i++){
try{
w.write(String.valueOf(i));
w.write (" ") ;
} catch (IOException e) {
e.printStackTrace(); // handle or log the exception
}
} //continue by writing the count
I suppose this means the problem is actually in reading the file?

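The write(String) method of Writer (which BufferedWriter extends) is synchronized internally, so each individual call is atomic. The two calls you make per number (the value, then the space) are not atomic as a pair.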
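To illustrate the per-call locking, here is a minimal sketch. It assumes the JDK's BufferedWriter behaviour described above; the class name SharedWriterDemo and the tokens are made up for the example, and a StringWriter stands in for the file so that nothing is written to disk.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class SharedWriterDemo {
    // Two threads write whole tokens to one shared BufferedWriter;
    // the combined text is returned once both have finished.
    static String run(int perThread) {
        StringWriter sink = new StringWriter();
        Writer shared = new BufferedWriter(sink);
        Thread a = new Thread(() -> writeTokens(shared, "AAAA ", perThread));
        Thread b = new Thread(() -> writeTokens(shared, "BBBB ", perThread));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
            shared.flush();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    private static void writeTokens(Writer w, String token, int n) {
        for (int i = 0; i < n; i++) {
            try {
                w.write(token); // one call per token, so a token is never split
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        String out = run(1000);
        // The order of the tokens may differ between runs, but each token is intact.
        System.out.println(out.length());
    }
}
```

If the token and its trailing space were written with two separate calls, as in the question, the other thread could slip in between them.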

The following program produces interleaved (non-synchronized) output. It is similar to yours, so I think you are omitting something.
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
public class ThreadTest {
final static int N = 1000;
static Runnable countDown(Writer writer){
return ()->{
for(int i = 0; i<N; i++){
try{
writer.write(String.format("%d\t", i));
} catch(Exception exc){
}
}
try {
writer.write("\n");
} catch (IOException e) {
e.printStackTrace();
}
};
}
public static void main(String[] args){
try(
Writer writer = Files.newBufferedWriter(Paths.get("test.txt"), StandardCharsets.UTF_8, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)
){
Thread a = new Thread(countDown(writer));
Thread b = new Thread(countDown(writer));
a.start();
b.start();
a.join();
b.join();
} catch (IOException e) {
e.printStackTrace();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
Then the output becomes:
0 0 1 1 2 2 3 3 4 4 5 6 5 6 7 7 8 8 9 10 9 10 11 11 12 13 12 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 28 29 29 30 30 31 32 31 33 32 34 33 35 34 36 37 35 38 36 39 37 40 38 41 39 42 40 43 41 42 44 43 45 44 46 45 47 46 48 47 49 50 48 49 51 50 52 53 51 54 52 55 56 53 57 54 58 55 59 56 60 57 61 58 62 59 63 60 64 61 62 65 63 66 64 67 65 68 66 69 67 70 68 71 69 72 70 73 71 74 72 75 73 76 74 77 75 78 76 77 79 78 80 79 81 80 82 81 83 84 82 85 83 86 84 87 85 88 86 89 87 90 88 91 89 92 90 93 91 94 92 95 93 96 97 94 98 95 99 96 100 97 101 98 102 99 103 100 104 101 105 102 106 103 107 104 108 105 109 106 110 107 111 108 112 109 113 110 114 111 115 112 116 113 117 114 118 115 119 116 120 117 121 118 122 119 123 120 124 121 125 122 126 123 127 124 128 125 129 126 130 127 131 128 132 129 133 130 134 131 135 132 136 133 137 134 138 135 139 136 140 137 141 138 142 139 143 140 144 141 145 142 146 143 147 144 148 145 149 146 150 147 151 148 152 149 153 150 154 151 155 152 156 153 157 154 158 155 159 156 160 157 158 161 162 159 163 160 164 161 165 162 166 163 167 164 168 165 169 166 170 167 171 168 172 169 173 170 174 171 175 172 176 173 177 174 178 175 179 176 180 177 181 178 182 179 183 180 184 181 185 182 186 183 187 184 188 185 189 186 190 187 191 188 192 189 193 190 191 194 192 195 193 196 194 197 195 198 196 199 197 200 198 201 199 202 203 200 204 201 205 202 206 203 207 204 208 209 205 210 206 211 207 212 213 208 214 209 215 210 216 211 217 212 218 213 219 214 220 215 221 216 222 217 223 218 224 219 225 220 226 221 227 222 228 223 229 224 230 225 231 226 232 227 233 234 228 235 236 229 237 230 238 231 239 232 233 240 234 241 235 242 236 243 237 244 238 245 239 246 247 240 241 248 242 249 243 250 244 251 245 252 253 246 247 254 248 255 249 256 250 257 258 251 259 252 253 260 261 254 262 255 256 257 263 258 264 259 265 260 266 261 267 262 268 263 269 264 270 265 271 266 272 267 273 268 274 269 275 270 276 271 277 272 278 273 279 274 
280 275 281 276 282 277 283 278 284 279 285 280 286 281 287 282 288 283 289 284 290 285 291 286 292 287 293 288 294 289 295 290 296 291 297 292 298 293 299 294 300 295 301 296 302 297 303 298 304 299 305 300 306 301 307 302 308 303 309 304 310 305 311 306 312 307 313 308 314 309 315 310 316 311 312 317 313 318 314 319 315 320 316 321 317 318 322 323 319 324 320 325 321 326 322 327 323 328 324 329 325 330 326 331 327 332 328 333 329 334 330 335 331 336 332 337 333 338 334 339 335 340 336 341 337 342 338 343 339 344 340 345 341 346 342 347 343 348 344 349 345 350 346 351 347 352 348 353 349 354 350 355 351 352 356 353 357 354 358 355 359 356 360 357 361 358 362 359 363 364 360 365 361 366 362 363 367 364 368 365 369 366 370 367 371 368 372 369 373 370 374 371 375 372 376 373 377 374 378 375 379 376 380 377 381 378 382 379 383 380 384 381 385 382 386 383 387 388 384 389 385 390 386 391 392 387 393 388 394 395 389 396 390 397 398 399 400 391 401 392 402 403 393 404 394 405 406 395 407 408 396 409 397 410 411 398 412 399 413 414 400 415 401 416 417 402 418 403 419 420 404 421 405 422 423 406 424 407 425 426 408 427 409 428 410 429 430 411 431 432 412 433 413 434 414 435 415 436 416 437 417 438 418 439 419 440 420 441 421 442 422 443 423 444 424 445 425 446 426 447 427 448 428 449 429 450 430 451 431 452 432 453 433 454 434 455 435 456 436 457 437 458 438 459 439 460 440 461 441 462 442 463 443 464 444 465 445 466 446 467 447 468 448 469 449 470 450 471 451 472 452 473 453 474 454 475 455 476 456 477 457 478 458 479 480 459 481 482 460 483 484 461 485 486 462 487 463 488 489 464 490 491 465 492 466 493 494 467 495 496 468 497 469 498 499 470 500 501 471 502 472 503 504 473 505 474 506 507 475 508 476 509 510 477 511 478 512 513 479 514 480 515 516 481 517 482 518 519 483 520 484 521 522 523 485 524 486 525 526 487 527 488 528 529 489 530 531 490 532 491 533 492 534 535 493 536 537 494 538 495 539 540 496 541 497 542 543 498 544 499 545 546 500 547 548 501 549 502 550 551 
503 552 504 553 554 505 555 506 556 557 507 558 508 559 560 509 561 562 510 563 511 564 565 512 566 513 567 568 514 569 515 570 516 571 572 517 573 518 574 519 575 576 520 577 521 578 579 522 580 581 523 582 524 583 525 584 585 526 586 527 587 588 528 589 529 590 591 530 592 593 531 594 532 595 596 533 597 534 598 599 535 600 536 601 602 537 603 538 604 605 539 606 540 607 541 608 609 542 610 543 611 612 544 613 614 545 615 546 616 617 547 618 548 619 620 549 621 550 622 623 551 624 552 625 626 553 627 554 628 629 555 630 631 556 632 557 633 634 558 635 636 559 637 560 638 639 561 640 641 562 642 563 643 644 564 645 565 646 647 566 648 567 649 650 568 651 652 569 653 654 570 655 571 656 657 572 658 659 573 660 574 661 662 575 663 664 576 665 577 666 667 578 668 579 669 670 580 671 672 581 673 582 674 675 583 676 584 677 678 585 679 680 586 681 682 587 683 684 588 685 589 686 590 687 591 592 688 593 594 689 595 690 596 691 597 692 598 693 599 694 600 695 696 601 697 602 698 603 699 604 700 605 701 606 702 607 703 608 704 609 705 610 706 611 707 612 708 613 709 710 614 711 615 712 616 713 617 714 618 715 619 716 620 717 621 718 622 719 720 623 721 722 624 723 625 724 725 626 726 627 727 728 628 729 629 730 731 630 732 631 733 734 632 735 633 736 737 634 738 635 739 740 636 741 637 742 743 638 744 639 745 746 640 747 641 748 749 642 750 643 751 752 644 753 645 754 755 646 756 647 757 758 648 759 649 760 761 650 762 763 651 764 765 652 766 653 767 768 654 769 655 770 771 656 772 657 773 774 658 775 659 776 777 660 778 661 779 780 662 781 663 782 783 664 784 665 785 786 666 787 788 667 789 668 790 669 791 792 670 793 671 794 795 672 796 673 797 798 674 799 800 675 801 676 802 677 803 804 678 805 806 679 807 808 680 809 681 810 811 682 812 683 813 814 684 815 685 816 817 686 818 687 819 820 688 821 689 822 823 690 824 691 825 826 692 827 693 828 829 694 830 695 831 832 696 833 697 834 835 698 836 699 837 838 700 839 701 840 841 702 842 703 843 704 844 705 845 706 846 707 
847 708 848 709 849 710 850 711 851 712 852 853 713 854 714 855 715 856 716 857 717 858 718 859 719 860 720 861 721 862 722 863 723 864 724 865 725 866 726 867 727 868 728 869 729 870 730 871 731 872 732 873 733 874 734 875 735 876 736 877 737 878 738 879 739 880 740 881 741 882 742 883 743 884 744 885 745 886 746 887 747 748 888 889 749 890 891 750 892 893 751 894 895 752 896 753 897 898 754 899 900 755 901 902 756 903 757 904 905 758 906 907 759 908 760 909 910 761 911 912 762 913 914 763 915 916 764 917 765 918 919 766 920 921 767 922 923 768 924 769 925 926 770 927 928 771 929 772 930 931 773 932 774 933 934 775 935 936 776 937 777 938 939 778 940 779 941 780 942 943 781 944 945 782 946 783 947 948 784 949 785 950 951 786 952 787 953 954 788 955 789 956 957 790 958 791 959 960 792 961 793 962 794 963 964 795 965 796 966 797 967 798 968 969 799 970 971 800 972 801 973 974 802 975 976 803 977 804 978 979 805 980 981 806 982 807 983 984 808 809 985 986 810 987 811 988 812 813 989 990 814 991 815 992 816 993 817 818 994 819 995 996 820 997 821 998 822 823 999
824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999
Which, as you can see, ends up jumbled.
In the case of your Scanner: since you are creating two Scanners for the same file on Windows, you are essentially synchronizing on the file. That is, the read/write sequence ends up behaving like this:
try{
synchronized(f){
Scanner s = new Scanner(f);
System.out.println("starting");
while(s.hasNext()){
System.out.println(s.next().length());
}
System.out.println("finished");
s.close();
}
}catch(Exception e){e.printStackTrace();}
The result is that one thread reads the file, finishes, and starts counting; only at that point does the other thread start reading the file. Reading the file takes much longer than counting, so the first thread has finished writing before the second one begins.

It's not clear what exactly you want. But if you just want t1 and t2 to mix up their numbers, you should make sure they start at the same time. An easy way to do that is to use a CountDownLatch that you pass to your runnables, like this:
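import java.io.IOException;
import java.io.Writer;
import java.util.concurrent.CountDownLatch;

public class MyRunnable implements Runnable {
    private final Writer w;
    private final CountDownLatch countDownLatch;

    MyRunnable(Writer w, CountDownLatch countDownLatch) {
        this.w = w;
        this.countDownLatch = countDownLatch;
    }

    @Override
    public void run() {
        try {
            // Block until the main thread releases the latch.
            countDownLatch.await();
            for (int i = 0; i < 1000; i++) {
                w.write(String.valueOf(i));
                w.write(" ");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}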
Create the latch before the threads and release it once both have been started:
try (Writer writer = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("Thread.txt"), "utf-8"))) {
CountDownLatch countDownLatch = new CountDownLatch(1);
Thread t1 =new Thread(new MyRunnable(writer, countDownLatch));
Thread t2 =new Thread(new MyRunnable(writer, countDownLatch));
t1.start();
t2.start();
countDownLatch.countDown(); //Start the threads
t1.join();
t2.join();
} //catch the exception,etc.
This will make sure that the time taken for the thread creation doesn't influence your test and that both threads actually start at the same time.


How to make 10x10 printf table display consistently in java?

I'm trying to display an ArrayList as a 10x10 table in the console using printf, like this:
930 396 466 242 315 254 217 820 287 216
595 220 13 494 186 645 309 902 560 56
797 980 201 301 479 694 509 778 702 360
253 995 647 725 327 774 861 420 37 753
948 107 935 867 399 818 73 427 485 70
575 385 174 400 940 296 76 569 362 732
21 197 948 421 852 954 640 528 119 659
96 55 475 5 903 940 299 45 432 79
352 363 698 873 130 704 89 245 45 288
646 378 967 179 94 607 261 710 504 20
The code works sometimes; however, it will randomly insert an extra line break or fail to insert one.
public static void main(String[] args) {
// This creates an ArrayList of 100 random integers between the values from 1 to 1000
int listSize = 100;
ArrayList<Integer> list = createList(listSize);
displayTable(list);
}
private static void displayTable(ArrayList<Integer> list) {
System.out.println();
for (Integer num : list) {
if (list.indexOf(num) % 10 == 0) {
System.out.printf("%n%5d", num);
} else {
System.out.printf("%5d", num);
}
}
System.out.println();
}
Here are two examples of outputs I've gotten:
969 500 12 256 945 105 402 868 213 658
909 144 165 217 828 628 395 682 816 199
769 220 218 958 97 237 36 92 220 712
332 640 547 893 210 926 868 486 914 307
740 962 109 745 347 896 74 922 686 593
26 964 677 321 889 690 956 892 720 915
631 90 824 338 887 822 49 529 521 841
504 946 302 253 175 107 765 225 6 101
747 841 143 642 533 662 143 528 733 209
377 366 928 511 404
26 296 946 597 717
673 257 970 480 595 936 1000 490 937 45
156 619 722 237 448 611 266 603 84 421
719 8 341 720 284 170 885 740 587 686
182 111 533 268 455 804 494 14 161 38
612
612 235 758 366 607 354 591 914 791
277 426 318 204 692 851 727 654 696 7
504 801 213 368 834 928 141 951 714 340
190 325 129 930 923 654 997 903 569 867
956 736 712 586 560 770 497 875 854 618
998 281 953 747 828 212 844 314 494 367
As @akuzminykh said, the problem is how you determine the position of each number in your list.
To be precise, this line:
if (list.indexOf(num) % 10 == 0) {
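will return the index of the FIRST occurrence of num in your list. So if a value appears twice and its first occurrence sits at a row start (an index divisible by 10), the later duplicate also triggers a new line out of nowhere. Conversely, a duplicate that sits at a row start but first occurred elsewhere will not start a new line.
Try using an explicit index in your for loop, or a counter.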
Look at your example:
969 500 12 256 945 105 402 868 213 658
909 144 165 217 828 628 395 682 816 199
769 220 218 958 97 237 36 92 220 712
332 640 547 893 210 926 868 486 914 307
740 962 109 745 347 896 74 922 686 593
26 964 677 321 889 690 956 892 720 915 <--- Starting with 26
631 90 824 338 887 822 49 529 521 841
504 946 302 253 175 107 765 225 6 101
747 841 143 642 533 662 143 528 733 209
377 366 928 511 404
26 296 946 597 717 <--- Starting with 26
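A minimal sketch of the counter-based approach suggested above. The class name TablePrinter and the helper formatTable are made up for this example; the row break depends only on the running position i, never on the value, so duplicates cannot disturb the layout.

```java
import java.util.Arrays;
import java.util.List;

final class TablePrinter {
    // Formats the list as rows of `columns` right-aligned values.
    static String formatTable(List<Integer> list, int columns) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < list.size(); i++) {
            // Break the line based on the position, not on indexOf(value).
            if (i > 0 && i % columns == 0) {
                sb.append(System.lineSeparator());
            }
            sb.append(String.format("%5d", list.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 26 appears three times on purpose.
        List<Integer> list = Arrays.asList(26, 964, 677, 26, 889, 26);
        System.out.println(formatTable(list, 3));
    }
}
```

For the 10x10 table in the question, the call would be formatTable(list, 10).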

Trying to detect cycle in graph using BFS but some test cases not passing

The first function, isCycle, receives its parameters from the website's driver code. I have spent more than two hours trying to debug why my logic does not work. Is there anything I need to improve here?
This is the link to the question:
https://practice.geeksforgeeks.org/problems/detect-cycle-in-an-undirected-graph/1/#
public boolean isCycle(int V, ArrayList<ArrayList<Integer>> adj) {
if(V<=2){
return false;
}
Map<Integer, ArrayList<Integer> > graph = new TreeMap<>();
for(int i = 0; i<V; i++){
graph.put(i,adj.get(i) );
}
return detectCycle(graph);
}
static boolean detectCycle(Map<Integer, ArrayList<Integer> > graph){
Map<Integer, Integer> parentChild = new TreeMap<>();
Map<Integer, Boolean> isVisited = new TreeMap<>();
Queue<Integer> q = new LinkedList<>();
q.add(0);
isVisited.put(0,true);
parentChild.put(0,-1);
while(!q.isEmpty() ){
int curNode = q.remove();
if(graph.get(curNode) == null){
q.add(++curNode);
continue;
}
ArrayList<Integer> curElements = graph.get(curNode);
if(!isVisited.containsKey(curNode)){
q.add(curNode);
isVisited.put(0,true);
parentChild.put(0,0);
}
for(int temp: curElements){
if(!isVisited.containsKey(temp)){
isVisited.put(temp, true);
parentChild.put(temp,curNode);
q.add(temp);
}
else{
if((parentChild.get(temp) != curNode) && (isVisited.get(temp) == true)){
return true;
}
}
}
}
return false;
}
It fails on the following test case, although short test cases pass.
1805 829
1 321
1 499
1 694
1 1360
2 115
2 881
3 1592
4 846
5 1085
6 1654
7 691
7 872
7 1172
7 1243
7 1721
9 844
9 984
12 724
13 353
13 895
13 1107
15 1514
20 1679
22 329
22 759
23 124
23 1003
24 1239
30 1032
31 1004
31 1342
31 1641
33 466
34 393
35 518
38 1666
39 831
40 362
41 1615
42 1006
43 886
45 428
46 1084
47 178
48 1177
49 1564
51 1283
52 1042
54 593
55 1456
57 1460
58 806
64 521
64 741
65 86
65 1325
66 759
66 1773
67 704
67 769
67 1541
69 244
70 896
70 1470
72 263
72 1460
73 1013
74 465
74 496
75 733
77 218
77 436
77 1105
79 1664
82 221
82 577
82 714
84 485
84 1768
85 1313
87 166
88 510
88 1266
90 139
92 679
92 1196
93 441
93 1157
94 1417
96 441
96 1716
97 768
97 1578
100 1047
102 1508
105 724
106 1356
106 1756
110 801
114 862
114 1188
115 892
115 1787
116 782
117 1178
118 572
118 1593
120 1156
121 128
123 494
127 456
127 1591
129 659
131 444
133 1352
134 146
134 1319
135 1049
136 735
137 349
137 1799
138 1042
139 375
139 887
140 170
142 453
142 1703
143 911
144 976
145 1415
146 1548
148 1581
149 396
149 546
149 1602
150 1548
150 1703
151 828
151 1404
153 406
158 513
160 187
161 292
164 1009
164 1465
165 786
166 817
168 1493
169 1320
170 211
170 615
174 1798
175 1300
176 1079
177 599
177 715
177 747
178 999
179 691
179 869
179 870
180 883
181 1318
183 233
186 1323
188 837
188 1089
189 720
189 1799
192 891
192 1514
195 913
198 1604
199 926
201 1004
201 1239
203 1109
204 841
204 1088
206 1110
207 1523
209 584
209 1649
210 738
211 773
212 915
212 1060
214 623
214 1313
215 318
215 1082
217 303
218 241
221 1247
223 1740
225 1351
226 353
229 1298
232 1558
233 1287
234 1309
236 624
238 1539
239 911
241 1145
242 787
243 991
244 1234
245 1195
246 657
246 1113
247 1339
248 342
249 454
252 1781
253 619
254 1354
254 1531
255 1508
258 1617
261 419
261 1728
262 530
262 1678
263 562
263 984
263 1288
263 1631
263 1683
265 738
265 982
265 1454
266 1613
271 828
276 1486
279 1719
280 549
280 1259
283 1722
284 1117
289 1301
290 419
291 494
291 1180
291 1738
292 527
294 686
294 1667
294 1702
294 1752
296 1318
300 1597
303 356
305 425
305 1071
305 1726
306 780
307 1098
307 1236
308 469
308 662
311 1520
313 1386
315 1374
323 888
326 1062
330 839
330 1586
336 415
337 645
338 1606
340 1443
341 1572
343 1380
344 569
344 1062
345 1227
346 1026
346 1066
347 1685
348 1404
349 1470
351 407
353 780
355 748
357 1532
358 1610
359 551
359 1674
360 1013
361 1443
362 726
362 768
362 1430
366 509
366 1617
370 1318
370 1334
377 1582
378 933
380 556
383 1553
385 470
385 1493
389 986
389 1738
390 1071
391 1268
392 941
393 1793
394 1504
394 1537
398 1022
398 1089
399 1540
400 1124
400 1619
401 1086
404 1585
410 491
410 561
410 856
413 1651
417 1260
418 1425
418 1723
420 1595
421 1063
423 559
423 1383
424 1572
425 861
425 1775
427 1584
428 1672
432 708
435 783
435 1037
435 1576
436 856
436 1022
437 1731
442 635
443 835
444 629
445 1135
446 743
446 836
448 918
450 1582
454 1136
455 767
458 1769
459 928
459 1448
462 1745
463 1161
466 789
466 1144
468 1799
470 971
472 565
472 709
472 810
474 542
474 854
477 654
479 848
482 1385
484 642
485 1207
489 1476
489 1788
490 1569
494 1087
495 617
498 1594
500 571
501 1747
506 720
506 1069
507 912
507 962
508 921
508 1343
509 1143
511 1291
514 1557
515 849
517 554
518 1141
520 1022
520 1354
520 1607
522 907
523 902
527 972
527 1170
527 1254
528 1051
531 606
531 860
535 1177
536 957
539 913
541 698
541 1213
543 1782
544 1242
544 1429
544 1575
547 789
548 1086
550 560
554 1759
555 801
557 1706
559 1551
561 1163
569 1167
571 859
571 1336
572 1135
573 1704
576 990
579 847
579 1684
580 1153
580 1794
581 1123
581 1201
582 1226
586 1315
588 1135
589 1414
594 1084
594 1284
595 1766
598 1650
602 1715
604 1168
605 1337
607 1429
610 816
610 1263
616 1435
619 1281
620 1180
624 704
625 860
625 1242
626 959
626 1171
627 1425
629 1382
630 878
630 1211
631 1494
633 916
638 927
638 975
638 1773
639 842
640 695
640 787
642 802
644 1333
645 900
650 890
653 1724
654 890
654 1214
655 1226
655 1712
657 1388
658 1208
659 840
659 1519
663 892
664 1523
668 1336
672 1062
675 745
677 1561
678 1060
680 1255
681 1475
681 1655
682 708
682 781
683 1486
687 1609
690 1618
690 1679
691 903
693 1084
695 719
696 1310
696 1357
699 908
705 758
714 1745
718 782
718 978
719 1682
723 978
723 1762
724 1454
731 1017
732 1393
736 1730
739 1764
740 1674
742 1618
748 1068
748 1071
748 1202
748 1569
749 853
749 1427
751 1625
754 1506
755 1615
757 1663
759 1132
760 789
760 1024
760 1575
760 1771
764 819
766 1021
766 1369
767 1424
770 1751
773 1363
773 1402
775 1372
777 1515
779 1519
780 1702
781 1170
782 1137
782 1353
785 1058
786 983
786 1080
787 1721
789 929
792 1392
792 1784
793 1246
795 1782
799 937
801 959
804 1480
805 1776
809 941
812 1275
813 1554
814 1388
817 904
817 1290
817 1345
818 1044
819 1014
820 1346
821 1607
822 1093
831 1209
835 1555
836 1138
844 1004
846 1223
852 974
852 1177
853 1121
858 870
860 1079
863 1261
867 1442
873 1763
876 1704
877 1301
877 1501
877 1643
883 1342
884 1745
889 1723
890 1376
894 1016
896 1132
896 1340
901 1628
903 1586
906 1388
910 1057
910 1082
912 1206
912 1413
912 1536
917 1670
918 1295
919 1105
925 1110
926 1716
927 1442
928 1335
931 1751
934 1337
939 1573
941 1559
948 1646
950 1314
951 1504
953 1675
954 1563
955 1522
955 1648
956 1229
956 1309
960 1232
962 1406
962 1672
964 998
964 1420
965 1661
967 1650
969 1771
972 1494
972 1744
973 1544
975 1464
977 1055
977 1226
978 1377
978 1790
979 1326
979 1545
980 1450
981 1157
982 1605
983 1130
986 1292
986 1769
987 1692
990 1788
994 1460
1000 1337
1003 1549
1008 1528
1009 1277
1009 1465
1013 1301
1013 1512
1013 1671
1014 1094
1017 1572
1025 1507
1025 1666
1032 1613
1033 1590
1035 1213
1037 1304
1039 1139
1040 1349
1044 1460
1046 1497
1058 1601
1067 1178
1067 1403
1069 1724
1070 1722
1071 1299
1072 1299
1073 1132
1079 1602
1081 1281
1082 1211
1088 1359
1091 1171
1092 1745
1107 1441
1109 1248
1109 1524
1109 1574
1113 1157
1116 1611
1116 1671
1117 1194
1119 1340
1120 1507
1120 1511
1124 1238
1127 1152
1127 1450
1129 1722
1133 1409
1133 1707
1135 1242
1135 1530
1137 1503
1139 1672
1142 1716
1143 1219
1144 1246
1149 1654
1153 1599
1156 1533
1157 1584
1174 1197
1174 1676
1175 1555
1176 1396
1176 1797
1182 1476
1185 1736
1192 1589
1198 1791
1199 1732
1206 1561
1206 1765
1218 1242
1219 1695
1220 1722
1223 1314
1224 1484
1235 1245
1237 1547
1239 1727
1239 1769
1251 1431
1251 1453
1253 1459
1255 1415
1255 1474
1255 1785
1256 1794
1258 1365
1260 1700
1265 1535
1271 1441
1281 1544
1291 1732
1302 1379
1308 1326
1308 1773
1313 1431
1318 1414
1323 1657
1324 1730
1333 1597
1337 1411
1337 1804
1343 1575
1345 1508
1353 1753
1356 1711
1358 1750
1359 1523
1361 1796
1368 1539
1369 1514
1378 1400
1382 1755
1383 1550
1389 1464
1391 1520
1398 1702
1399 1550
1404 1507
1410 1778
1412 1764
1417 1441
1421 1608
1422 1439
1427 1511
1431 1610
1438 1509
1441 1461
1442 1506
1445 1626
1445 1649
1446 1507
1446 1622
1450 1660
1453 1756
1459 1603
1462 1546
1463 1646
1463 1755
1475 1799
1478 1595
1484 1629
1488 1702
1493 1762
1494 1656
1497 1629
1497 1804
1520 1724
1523 1666
1528 1790
1529 1681
1530 1614
1533 1735
1549 1621
1550 1589
1553 1672
1556 1688
1569 1698
1570 1621
1570 1747
1582 1629
1584 1622
1600 1657
1611 1699
1621 1768
1632 1794
1634 1648
1640 1644
1653 1792
1656 1765
1671 1672
1684 1735
1686 1783
1711 1803
1728 1756
1738 1777
1761 1766
1772 1780
Well, assuming our goal is not to find the optimal solution but to solve the problem using BFS...
The following code snippet:
Queue<Integer> q = new LinkedList<>();
q.add(0);
isVisited.put(0,true);
parentChild.put(0,-1);
suffers from the following issue: with BFS we need to traverse the graph in a very specific order. Basically, we need some concept of a level and must always move from one level to the next. Think about why your code fails the following simple test cases (yes, both geeksforgeeks and hackerrank failed to provide reliable test cases):
4, {{}, {2,3}, {1,3}, {1,2}} - returns false, expected true
4, {{3}, {3}, {3}, {0,1,2}} - returns true, expected false
One BFS idea: if the graph has a cycle, disconnecting leaf nodes (removing the corresponding edges and splitting a single graph into two) does not affect that cycle. So if we iteratively disconnect all leaf nodes (each round is one level) and the graph still has edges, we have found a cycle.
public static boolean isCycle(int V, List<List<Integer>> adj) {
// queue contains leaf nodes only
Set<Integer> queue = new HashSet<>();
List<Set<Integer>> graph = new ArrayList<>();
// preparing input data
for (int node = 0; node < V; node++) {
List<Integer> input = adj.get(node);
Set<Integer> neighbours = input != null ? new HashSet<>(input) : new HashSet<>();
neighbours.remove(node);
graph.add(node, neighbours);
if (neighbours.size() == 1) {
// leaf node found, adding to queue
queue.add(node);
}
}
while (!queue.isEmpty()) {
// next BFS level
Set<Integer> level = new HashSet<>();
for (int node : queue) {
for (int neighbour : graph.get(node)) {
// removing edge neighbour->node
Set<Integer> next = graph.get(neighbour);
next.remove(node);
if (next.size() == 1) {
// got another one leaf node among neighbours
level.add(neighbour);
}
}
// removing edges node->neighbours
graph.set(node, new HashSet<>());
}
queue = level;
}
return graph.stream().anyMatch(set -> set.size() > 0);
}
However, if your BFS idea was to traverse the graph level by level, the code could be as follows (I see no difference from DFS, though):
public static boolean isCycle(int V, List<List<Integer>> adj) {
boolean[] seen = new boolean[V];
for (int i = 0; i < V; i++) {
if (seen[i]) {
continue;
}
Deque<int[]> q = new LinkedList<>();
// storing (node, parent)
q.add(new int[]{i, -1});
seen[i] = true;
while (!q.isEmpty()) {
int[] node = q.poll();
for (int next : adj.get(node[0])) {
// seen node on the previous level
if (next == node[1]) {
continue;
}
if (seen[next]) {
return true;
}
q.add(new int[]{next, node[0]});
seen[next] = true;
}
}
}
return false;
}
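As a quick check, the two small test cases mentioned earlier can be run against the level-by-level version. The sketch below restates that method so it compiles on its own; the class name CycleCheck and the helper toAdj are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class CycleCheck {
    // BFS over every component, remembering the parent each node was reached from.
    static boolean isCycle(int v, List<List<Integer>> adj) {
        boolean[] seen = new boolean[v];
        for (int start = 0; start < v; start++) {
            if (seen[start]) {
                continue;
            }
            Deque<int[]> queue = new ArrayDeque<>();
            queue.add(new int[]{start, -1}); // (node, parent)
            seen[start] = true;
            while (!queue.isEmpty()) {
                int[] cur = queue.poll();
                for (int next : adj.get(cur[0])) {
                    if (next == cur[1]) {
                        continue; // the edge back to the parent is not a cycle
                    }
                    if (seen[next]) {
                        return true; // reached a second time via another path
                    }
                    seen[next] = true;
                    queue.add(new int[]{next, cur[0]});
                }
            }
        }
        return false;
    }

    // Hypothetical helper: turns an int[][] into the adjacency-list shape.
    static List<List<Integer>> toAdj(int[][] rows) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int[] row : rows) {
            List<Integer> neighbours = new ArrayList<>();
            for (int x : row) {
                neighbours.add(x);
            }
            adj.add(neighbours);
        }
        return adj;
    }

    public static void main(String[] args) {
        // Node 0 is isolated; the cycle 1-2-3 lives in another component.
        System.out.println(isCycle(4, toAdj(new int[][]{{}, {2, 3}, {1, 3}, {1, 2}})));
        // A star around node 3 is a tree, so there is no cycle.
        System.out.println(isCycle(4, toAdj(new int[][]{{3}, {3}, {3}, {0, 1, 2}})));
    }
}
```

The outer loop over start is what covers components that are not reachable from node 0.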

Exploring horizontal parsing TJ in pdf(Detail understanding of tx formula)?

Yes, I know this is a repeated question, but I still need to understand a lot about horizontal parsing. I am hoping for a full, clear answer here.
I have some example content streams like the ones below:
Example1:
BT
/F33 20.665 Tf
72 633.8289 Td
[(Chapter)-375(12)]TJ
/F33 24.78709 Tf
0 51.30099 Td
[(P)31(arametric)-375(and)-375(P)32(olar)-375(Curv)31(es)]TJ
Example2:
BT
/C0_1 14 Tf
39.812999 681.73999 Td
[(\000"\000M\000U\000I\000P\000V\000H\000I\000\001)-82(\000$\000B\000S\000P\000V\000T\000F\000M\000\0001)-82(\000X\000B\000T\000\001)-82.07099........]TJ
The formula for the horizontal displacement (tx) is
tx = ((w0 - Tj/1000) × Tfs + Tc + Tw) × Th
Now I want to substitute the values for Example 1:
W0 = ? (Here mkl mentioned that w0 is the width of the respective character from the widths array. How can I get the widths? What are the values for the examples above? How can I get them from an existing PDF? How can I get character widths from CMaps?)
Tj = The numbers in TJ array.
Tfs = use the font size from the graphics state which is the font size parameter from the relevant Tf operation, e.g. 10.
Tc = use the value from the graphics state which is the parameter from the relevant Tc or " operation.
Tw = use 0 or (in case of a single-byte character code 32) the value from the graphics state which is the parameter from the relevant Tw or " operation.
Th = use the value from the graphics state which is the parameter from the relevant Tz operation divided by 100.
Please write a step-by-step solution for each example and, if possible, explain all the kinds of TJ arrays we may see in PDF content streams. I read about the concept in PDF32000_2008 (9.4.4 Text Space Details) but I am still confused. You can find the actual PDFs at the links below:
Example 1 file
Example2 file
It sounds like you most of all wonder where to retrieve the widths, the w0 values from.
This is actually easy: the widths are in the PDF font objects! For simple fonts, the width values are in the Widths array. The only exception is the standard 14 fonts, for which a PDF processor is expected to know the glyph widths. For CID fonts, the width values are in the W array; glyphs not listed there use the DW value, which itself defaults to 1000.
In case of Type 1, TrueType, and CID fonts the widths are measured in units in which 1000 units correspond to 1 unit in text space.
In case of Type 3 fonts these widths shall be interpreted in glyph space as specified by FontMatrix; but as a note there indicates a common practice is to define glyphs in terms of a 1000-unit glyph coordinate system, in which case the font matrix is [0.001 0 0 0.001 0 0] which gives rise to the same 1000:1 ratio as above.
Your first example
/F33 20.665 Tf
72 633.8289 Td
[(Chapter)-375(12)]TJ
Here the font F33 is selected with font size 20.665 in the first instruction. That font is defined in object 21, whose Widths array is object 24:
24 0 obj
[656.2 625 625 937.5 937.5 312.5 343.7 562.5 562.5 562.5 562.5 562.5 849.5 500 574.1 812.5 875 562.5 1018.5 1143.5 875 312.5 342.6 581 937.5 562.5 937.5 875 312.5 437.5 437.5 562.5 875 312.5 375 312.5 562.5 562.5 562.5 562.5 562.5 562.5 562.5 562.5 562.5 562.5 562.5 312.5 312.5 342.6 875 531.2 531.2 875 849.5 799.8 812.5 862.3 738.4 707.2 884.3 879.6 419 581 880.8 675.9 1067.1 879.6 844.9 768.5 844.9 839.1 625 782.4 864.6 849.5 1162 849.5 849.5 687.5 312.5 581 312.5 562.5 312.5 312.5 546.9 625 500 625 513.3 343.7 562.5 625 312.5 343.7 593.7 312.5 937.5 625 562.5 625 593.7 459.5 443.8 437.5 625 593.7 812.5 593.7 593.7 500]
endobj
21 0 obj
<<
/BaseFont /HLFPHX+CMBX12
/FirstChar 11
/FontDescriptor 22 0 R
/LastChar 122
/Subtype /Type1
/Type /Font
/Widths 24 0 R
>>
Thus, the glyph with code 11 has a width of .6562, the one with code 12 is .625 units wide, etc.
So at the beginning the text matrix and text line matrix point to (0, 0). After 72 633.8289 Td they point to (72, 633.8289). This is where 'C' is drawn.
Drawing 'C' advances the position the text matrix points to by a tx value of
((w0 - Tj/1000) × Tfs + Tc + Tw) × Th
The 'C' we see in the instruction parameter actually is the byte 0x43 = 67. With FirstChar 11, its width sits at index 67 - 11 = 56 (0-based) of the Widths array: 812.5, i.e. w0 = .8125 at the 1000:1 ratio. There are no numeric parameters immediately following the 'C', so Tj is 0. Tfs is 20.665. Tc and Tw both are 0. Th is 1.
Thus, tx is ((.8125 - 0) × 20.665 + 0 + 0) × 1 = 16.7903125 and drawing 'C' advances the position the text matrix points to to (88.7903125, 633.8289). This is where 'h' is drawn.
Similarly drawing 'h' advances the position by tx = ((.625 - 0) × 20.665 + 0 + 0) × 1 = 12.915625 to (101.7059375, 633.8289). This is where 'a' is drawn.
Drawing 'a' advances the position by tx = ((.5469 - 0) × 20.665 + 0 + 0) × 1 = 11.3016885 to (113.007626, 633.8289). This is where 'p' is drawn.
Drawing 'p' advances the position by tx = ((.625 - 0) × 20.665 + 0 + 0) × 1 = 12.915625 to (125.923251, 633.8289). This is where 't' is drawn.
Drawing 't' advances the position by tx = ((.4375 - 0) × 20.665 + 0 + 0) × 1 = 9.0409375 to (134.9641885, 633.8289). This is where 'e' is drawn.
Drawing 'e' advances the position by tx = ((.5133 - 0) × 20.665 + 0 + 0) × 1 = 10.6073445 to (145.571533, 633.8289). This is where 'r' is drawn.
Drawing 'r' considering the numeric parameter -375 advances the position by tx = ((.4595 - (-375/1000)) × 20.665 + 0 + 0) × 1 = 17.2449425 to (162.8164755, 633.8289). This is where '1' is drawn.
Drawing '1' advances the position by tx = ((.5625 - 0) × 20.665 + 0 + 0) × 1 = 11.6240625 to (174.440538, 633.8289). This is where '2' is drawn.
Drawing '2' advances the position by tx = ((.5625 - 0) × 20.665 + 0 + 0) × 1 = 11.6240625 to (186.0646005, 633.8289).
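Following this, the instruction /F33 24.78709 Tf changes the text font size to 24.78709, and the instruction 0 -51.30099 Td moves the text line matrix and text matrix to the start of the next line. Td offsets from the start of the current line, (72, 633.8289), so the new position is (72 + 0, 633.8289 - 51.30099) = (72, 582.52791). This is where 'P' is drawn.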
Drawing 'P' considering the numeric parameter 31 advances the position by tx = ((.7685 - (31/1000)) × 24.78709 + 0 + 0) × 1 = 18.280478875 to (90.280478875, 582.52791). This is where 'a' is drawn.
...
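The arithmetic above can be replayed mechanically. The following is only a minimal sketch, not code from any PDF library: the class name TextAdvance and the method advance are made up for illustration, and the widths are the Widths entries used above for the characters of "Chapter12" (with the -375 adjustment after the 'r').

```java
public class TextAdvance {

    // tx = ((w0 - Tj/1000) * Tfs + Tc + Tw) * Th
    // widthThousandths is the raw Widths entry (e.g. 812.5), so it is
    // divided by 1000 to obtain w0. Tw is simply passed in here; in a real
    // content stream it only applies to the single-byte code 32.
    static double advance(double widthThousandths, double tj, double fontSize,
                          double charSpacing, double wordSpacing, double horizScale) {
        double w0 = widthThousandths / 1000.0;
        return ((w0 - tj / 1000.0) * fontSize + charSpacing + wordSpacing) * horizScale;
    }

    public static void main(String[] args) {
        String glyphs = "Chapter12";
        // Widths entries for C, h, a, p, t, e, r, 1, 2 as looked up above.
        double[] widths = {812.5, 625, 546.9, 625, 437.5, 513.3, 459.5, 562.5, 562.5};
        // Numeric TJ parameter following each glyph (only 'r' has one).
        double[] tj = {0, 0, 0, 0, 0, 0, -375, 0, 0};

        double x = 72.0;          // set by "72 633.8289 Td"
        double fontSize = 20.665; // Tfs
        for (int i = 0; i < widths.length; i++) {
            System.out.printf("'%c' is drawn at x = %.7f%n", glyphs.charAt(i), x);
            x += advance(widths[i], tj[i], fontSize, 0, 0, 1);
        }
        System.out.printf("position after the last glyph: x = %.7f%n", x);
    }
}
```

Running it prints one x position per glyph, which can be compared with the values derived by hand above; small differences in the last digits are just floating-point rounding.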
Your second example
/C0_1 14 Tf
39.812999 681.73999 Td
[(\000"\000M\000U\000I\000P\000V\000H\000I\000\001)-82(\000$\000B\000S\000P\000V\000T\000F\000M\000\001)-82(\000X\000B\000T\000\001)-82.07099........]TJ
Here the font C0_1 is selected with font size 14. This font is composite, its descendant font is defined in object 24:
24 0 obj
<<
/BaseFont /NFAHTB+MinionPro-Regular
/CIDSystemInfo 30 0 R
/DW 1000
/FontDescriptor 31 0 R
/Subtype /CIDFontType0
/Type /Font
/W [0 [500 227 276 318]
4 5 480 6 [756 711 223]
9 10 346
11 [404 580 228 356 228 331]
17 26 480 27 28 228 29 [552 580 552 379 753 691 588 665 735 568
529 715 766 341 329 673 538 891 743 747
563 745 621 474 617 736 703 971 654 634
603 345 333 345 566 500 224 439 508 423
528 425 296 468 534 268 256 496 253 819
547 510 524 511 371 367 305 531 463 685
472 459 420 347 263 347 580 276]
97 98 480 99 [159]
100 101 480 102 [477 480 169 398 444]
107 108 279 109 [535 533 520 490 489 226 497 390 239 429
401 445 970 1062 379]
124 136 400 137 [922 869 305 550 749 973 334 671 268 273
513 770 545 341 580 512 459 737 762 580
549 762 580 263 343 514 762 341 321 580
505 580 341 702]
171 176 691 177 [661]
178 181 568 182 185
341 186 [743]
187 191 747 192 [474]
193 196
736 197 198 634 199 [603]
200 205 439 206
[421]
207 210 425 211 214 268 215 [547]
216
220 510 221 [367]
222 225 531 226 227 459
228 [420 503 500 480 418]
233 238 762 239 [691 926 666 627 737 736 766 613 518 637
606 499 1029 763 493 267 526 541 533 525
547 303 385 669 1071 914 876 722 803 561
1071 1081 798 787 1045 801 852 814 535 520
778 533 582 522 856 664 804 814 533 777]
289 290 533
291 [578]
292 293 800 294 298 480 299 [828 439 790 565 511 531 584 482 456 565
621 306 297 558 460 709 580 584 484 585
528 408 510 582 567 761 551 511 493 611
621 306 582 510 579 611 481 431 815 723
776 268 606 603 622 242 235 345 346 530
340 446 406 486 403 499 437 466 486 473
468 529 486 481 489 528 483 481 519 710
1009 711 493 338 465 452 497 454 495 464
475 488 493 480 479 574 480 482 480 568
483 486 482]
392 411 486 412 [305 349 355]
415 416 292 417 [306 372 194 192 543 371 334 262 265 228]
427 436 341 437 [178 177]
439 440 341 441 [259]
442 443 245 444 453 341 454 [178 177]
456 457
341 458 [259]
459 460 245 461 470 341 471
[178 177]
473 474 341 475 [259]
476 477 245 478
487 341 488 [178 177]
490 491 341 492 [259]
493
494 245 495 497 606 498 [454 469 407 563]
502 507 691
508 [1058 813]
510 512 691 513 520 766 521 [566 766]
523 526 757 527 [640 757 598 681]
531 532 652 533 534
877 535 536 631 537 540 757 541 542 510
543 [256]
544 545 846 546 [753 922 520 276 444 445]
552 553 279
554 [356 379]
556 557 347 558 559 345 560 561
346 562 [226]
563 564 579 565 [586 587 760 556 375 490 718 561 536 641
757 531 568]
578 580
691 581 [722]
582 585 665 586 [735]
587 591
568 592 596 715 597 [766]
598 602 341 603
[329 673]
605 608 538 609 [891]
610 613 743 614
616 747 617 [749]
618 620 621 621 [474 477]
623
624 474 625 626 617 627 629 736 630 [733]
631 632 736 633 636 971 637 639 634 640
641 603 642 [869]
643 644 1071 645 647 439
648 [512]
649 652 423 653 [528]
654 657 425
658 [424]
659 663 468 664 [534]
665 668 268
669 [258 496]
671 673 253 674 [271 819]
676 679 547
680 682 510 683 [513]
684 686 371 687 [367 366]
689 690 367 691 692 305 693 698 531 699
702 685 703 705 459 706 707 420 708 [671 367]
710 711 492 712 724 400 725 728 565 729
[723]
730 731 565 732 [568]
733 734 565 735
[643]
736 737 531 738 [528]
739 740 531 741
[584]
742 749 482 750 [487]
751 755 565 756
[621]
757 760 306 761 [474]
762 763 306 764
[308 306 297 558]
768 771 460 772 [478 709]
774 778 580 779
785 584 786 [582 584]
788 790 528 791 792 408
793 [412]
794 795 408 796 797 510 798 804
582 805 [584]
806 807 582 808 811 761 812
816 511 817 819 493 820 [401 402 401 381 401 375 404 400 401 400
401 400 367 401 691 588 507 641 568 603
766 739 341 673 686 891 743 607 747 738
563 598 617 655 754 654 725 757 691 568
766]
861 862 341
863 [747]
864 865 655 866 [757]
867 873 691
874 882 910 883 887 691 888 895 568 896
901 766 902 910 972 911 914 766 915 926
341 927 932 757 933 941 1007 942 945 757
946 953 747 954 [563 341]
956 963 655 964 [341 329 889 959 776 650 653 741 691 580
588 512 649 568 954 518]
980 981 752 982 [650 645 891 766 747 735 563 665 617 523
510 495 497 403 381 509 490 245]
1000 1001 493 1002 [512 476 404 510 501 515 446 481 587 467
605 645 403 497 496 582 665 404 508 669
544 453 523 403 509]
1027 1028 245 1029 [510]
1030 1031 481 1032 [645 245 481]
1035 1042 523 1043 1048 403 1049 1056 509 1057
1064 245 1065 1070 510 1071 1078 481 1079 1086
645 1087 1088 523 1089 1090 403 1091 1092 509
1093 1094 245 1095 1096 510 1097 1098 481 1099
1100 645 1101 1108 523 1109 1116 509 1117 1124
645 1125 1130 523 1131 1135 509 1136 1141 245
1142 1145 481 1146 1147 501 1148 [481]
1149 1153
645 1154 [523 481]
1156 1159 230 1160 1171 400 1172
[353]
1173 1177 400 1178 1179 405 1180 [400 653 767 654 741 666 958 960 720 840
581 644 956 636 439 501 486 389 490 425
726 408]
1202
1203 555 1204 [500 494 640 553 510 552 524 423 441 459
672 472 556 507 771 775 566 681 468 440
707 500 425 500 389 449 367]
1231 1232 268 1233 [256 673 719 533 500 468 545 689 547 736
511 680 467 477 366 428 356 411 872 974
1124 1133 957 457 603 623 830 1006 806 1408
1744 1095 643 566 821 836 906 1602 1675 1584
427 892]
1275
1276 745 1277 [465 619 776 427 341 566 892]
1284 1287 400 1288 [747 736 525 547]
1292
1293 480 1294 1305 691 1306 1313 568 1314 1315
341 1316 1327 747 1328 1334 736 1335 1337 634
1338 1349 439 1350 1357 425 1358 1359 268 1360
1366 510 1367 1371 525 1372 1373 531 1374 1378
547 1379 1381 459 1382 1393 637 1394 1401 606
1402 1413 565 1414 1421 482 1422 1423 306 1424
1436 584 1437 1444 582 1445 1447 511 1448 1457
400 1458 [392]
1459 1480 400 1481 [565 511 531 584 482 456 565 621 306 297
558 460 709 580 584 484 585 528 408 510
582 567 761 551 511 493 611 621 582 510
579 611 481 431 723 776 603]
1518 1521
565 1522 [723]
1523 1524 565 1525 [568]
1526 1528
565 1529 1530 531 1531 [528]
1532 1533 531 1534
[584]
1535 1542 482 1543 [487]
1544 1548 565 1549
[621]
1550 1556 306 1557 [308 306 297 558]
1561 1564 460 1565
[478 709]
1567 1571 580 1572 1578 584 1579 [582 584]
1581
1583 528 1584 1585 408 1586 [412]
1587 1588 408
1589 1590 510 1591 1597 582 1598 [584]
1599 1600
582 1601 1604 761 1605 1609 511 1610 1612 493
1613 1624 565 1625 1632 482 1633 1634 306 1635
1647 584 1648 1655 582 1656 1658 511 1659 [477 366 617 305 356 227 400 159 226 306
159]
1670 1671 105 1672 [495 565 762 916 297 223 480 461 480 486
480 472 468 486]
]
>>
endobj
Thus, the glyph with code 0 has a width of 500, code 1 has 227, code 2 has 276, code 3 has 318, code 4 and 5 have 480, code 6 has 756, etc.
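Furthermore, it is important that the font encoding is Identity-H, which is a pure two-byte encoding: every two bytes of the string form one character code, and that code is used directly as the CID when looking up the width.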
Thus, at the beginning the text matrix and text line matrix point to (0, 0). After 39.812999 681.73999 Td they point to (39.812999, 681.73999). This is where '\000"' = 0x0022 is drawn.
Drawing 0x0022 advances the position by tx = ((.691 - 0) × 14 + 0 + 0) × 1 = 9.674 to (49.486999, 681.73999). This is where '\000M' = 0x004d is drawn.
...
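The W array above mixes two entry forms: "c [w1 w2 ... wn]" assigns widths to consecutive CIDs starting at c, and "cfirst clast w" assigns the same width w to a whole range; CIDs not covered fall back to /DW. A minimal Java sketch of such a lookup follows. The class CidWidths and its methods are hypothetical names for illustration, and only the first few entries of the array are loaded, so every other CID wrongly falls back to the default here.

```java
import java.util.HashMap;
import java.util.Map;

public class CidWidths {
    private final Map<Integer, Integer> widths = new HashMap<>();
    private final int defaultWidth;

    CidWidths(int defaultWidth) {
        this.defaultWidth = defaultWidth;
    }

    // Entry form "c [w1 w2 ... wn]": consecutive CIDs starting at firstCid.
    void addList(int firstCid, int... ws) {
        for (int i = 0; i < ws.length; i++) {
            widths.put(firstCid + i, ws[i]);
        }
    }

    // Entry form "cfirst clast w": one width for the whole inclusive range.
    void addRange(int firstCid, int lastCid, int w) {
        for (int cid = firstCid; cid <= lastCid; cid++) {
            widths.put(cid, w);
        }
    }

    // Width in thousandths; falls back to the /DW value.
    int widthOf(int cid) {
        return widths.getOrDefault(cid, defaultWidth);
    }

    public static void main(String[] args) {
        CidWidths w = new CidWidths(1000);        // /DW 1000
        w.addList(0, 500, 227, 276, 318);         // 0 [500 227 276 318]
        w.addRange(4, 5, 480);                    // 4 5 480
        w.addList(6, 756, 711, 223);              // 6 [756 711 223]
        w.addRange(9, 10, 346);                   // 9 10 346
        w.addList(11, 404, 580, 228, 356, 228, 331);
        w.addRange(17, 26, 480);
        w.addRange(27, 28, 228);
        w.addList(29, 552, 580, 552, 379, 753, 691); // start of the long list at 29

        int cid = 0x0022;                         // '\000"' under Identity-H
        double tx = w.widthOf(cid) / 1000.0 * 14; // font size 14, no spacing, Th = 1
        System.out.printf("CID %d: width %d, tx = %.3f%n", cid, w.widthOf(cid), tx);
    }
}
```

For CID 0x0022 = 34 this finds the width 691 used in the calculation above.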
As you can see, the cases in your example files are simple:
no use of character or word spacing, no horizontal scaling;
very simple text and text line matrices, plain translation;
no changes to the current transformation matrix;
no standard 14 fonts;
...
Nonetheless, the concept should have become clear.
And I hope I have not miscalculated too often... ;)

Full content is not written into a file

I have the following data in my text file.
10
100
3
5 75 25
200
7
150 24 79 50 88 345 3
8
8
2 1 9 4 4 56 90 3
542
100
230 863 916 585 981 404 316 785 88 12 70 435 384 778 887 755 740 337 86 92 325 422 815 650 920 125 277 336 221 847 168 23 677 61 400 136 874 363 394 199 863 997 794 587 124 321 212 957 764 173 314 422 927 783 930 282 306 506 44 926 691 568 68 730 933 737 531 180 414 751 28 546 60 371 493 370 527 387 43 541 13 457 328 227 652 365 430 803 59 858 538 427 583 368 375 173 809 896 370 789
789
65
591 955 829 805 312 83 764 841 12 744 104 773 627 306 731 539 349 811 662 341 465 300 491 423 569 405 508 802 500 747 689 506 129 325 918 606 918 370 623 905 321 670 879 607 140 543 997 530 356 446 444 184 787 199 614 685 778 929 819 612 737 344 471 645 726
101
5
722 600 905 54 47
35
51
210 582 622 337 626 580 994 299 386 274 591 921 733 851 770 300 380 225 223 861 851 525 206 714 985 82 641 270 5 777 899 820 995 397 43 973 191 885 156 9 568 256 659 673 85 26 631 293 151 143 423
890
62
286 461 830 216 539 44 989 749 340 51 505 178 50 305 341 292 415 40 239 950 404 965 29 972 536 922 700 501 730 430 630 293 557 542 598 795 28 344 128 461 368 683 903 744 430 648 290 135 437 336 152 698 570 3 827 901 796 682 391 693 161 145
163
90
22 391 140 874 75 339 439 638 158 519 570 484 607 538 459 758 608 784 26 792 389 418 682 206 232 432 537 492 232 219 3 517 460 271 946 418 741 31 874 840 700 58 686 952 293 848 55 82 623 850 619 380 359 479 48 863 813 797 463 683 22 285 522 60 472 948 234 971 517 494 218 857 261 115 238 290 158 326 795 978 364 116 730 581 174 405 575 315 101 99
295
17
678 227 764 37 956 982 118 212 177 597 519 968 866 121 771 343 561
Here the first number gives the number of test cases (10 in this example). Each test case consists of the target sum, then the size of the array on the next line, followed by the array elements. For each test case I need to find the two array elements that add up to the given sum and print their (1-based) positions.
For the data above the results should be:
2, 3
1, 4
4, 5
29, 46
11, 56
4, 5
40, 46
16, 35
55, 74
7, 9
I've been trying the code below to write the output to a file, and I also print it to the console as a check. But only the result of the last case ends up in the file. Please let me know where I am going wrong.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Scanner;
public class FromFile {
public static void main(String args[]) throws Exception {
Scanner s = new Scanner(new File("D:/A1.txt"));
int testCaseCount = Integer.parseInt(s.next());
for (int i = 0; i < testCaseCount; i++){
int Avail = Integer.parseInt(s.next());
int size=Integer.parseInt(s.next());
ArrayList<Integer> list = new ArrayList<Integer>();
for(int j=0;j<size;j++)
{
list.add(s.nextInt());
}
for(int k=0;k<list.size()-1;k++){
for(int j=k+1; j<=list.size()-1;j++){
int sum=list.get(k)+list.get(j);
if(sum==Avail){
System.out.println((k+1)+", "+(j+1));
File file=new File("D:/A2.txt");
if(!file.exists()){
file.createNewFile();
}
FileWriter fw=new FileWriter(file.getAbsoluteFile());
BufferedWriter bw=new BufferedWriter(fw);
bw.write("Case:"+i+"-"+(k+1)+", "+(j+1)+" Available is "+Avail+" Values are "+list.get(k)+","+ list.get(j));
bw.close();
// System.out.println("Done");
}
}
}
}
s.close();
}
}
The only line written to the file is
Case:9-7, 9 Available is 295 Values are 118,177
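It is only the last one because opening a file with a FileWriter empties the file by default, and your code does that for every match. Open the file only once, before the loops, not for each element, and close it after the last test case.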
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.ArrayList;
import java.util.Scanner;

public class FromFile {
    public static void main(String args[]) throws Exception {
        Scanner s = new Scanner(new File("D:/A1.txt"));
        // Open the output file exactly once, before any test case is read.
        // Every new FileWriter(file) truncates the file, so creating the
        // writer per match (or per test case) keeps only the last result.
        // FileWriter also creates the file if it does not exist yet.
        File file = new File("D:/A2.txt");
        BufferedWriter bw = new BufferedWriter(new FileWriter(file));
        int testCaseCount = Integer.parseInt(s.next());
        for (int i = 0; i < testCaseCount; i++) {
            int Avail = Integer.parseInt(s.next());
            int size = Integer.parseInt(s.next());
            ArrayList<Integer> list = new ArrayList<Integer>();
            for (int j = 0; j < size; j++) {
                list.add(s.nextInt());
            }
            for (int k = 0; k < list.size() - 1; k++) {
                for (int j = k + 1; j <= list.size() - 1; j++) {
                    int sum = list.get(k) + list.get(j);
                    if (sum == Avail) {
                        System.out.println((k + 1) + ", " + (j + 1));
                        bw.write("Case:" + i + "-" + (k + 1) + ", " + (j + 1) + " Available is " + Avail + " Values are " + list.get(k) + "," + list.get(j));
                        bw.newLine(); // one result per line
                    }
                }
            }
        }
        // Closing flushes the buffer; do it once, after all test cases.
        bw.close();
        s.close();
    }
}
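To see the truncation in isolation, here is a small self-contained sketch; the class WriterModes, its methods and the temporary file are made up for illustration. It writes the same three lines in three ways: re-opening the file for every line, opening it once, and re-opening it in append mode via the FileWriter(File, boolean) constructor.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

public class WriterModes {

    // Re-opens the file for every line. With append == false each open
    // truncates the file, so only the last line survives.
    static void reopenPerLine(File f, List<String> lines, boolean append) throws IOException {
        for (String line : lines) {
            try (BufferedWriter bw = new BufferedWriter(new FileWriter(f, append))) {
                bw.write(line);
                bw.newLine();
            }
        }
    }

    // Opens the file once and writes all lines through the same writer.
    static void openOnce(File f, List<String> lines) throws IOException {
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(f))) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        }
    }

    // Returns the number of lines left in the file after each strategy:
    // {re-open and truncate, open once, re-open and append}.
    static int[] demo() {
        List<String> lines = Arrays.asList("2, 3", "1, 4", "4, 5");
        try {
            File f = File.createTempFile("writer-modes", ".txt");
            f.deleteOnExit();
            int[] counts = new int[3];

            reopenPerLine(f, lines, false);
            counts[0] = Files.readAllLines(f.toPath()).size();

            openOnce(f, lines);
            counts[1] = Files.readAllLines(f.toPath()).size();

            new FileWriter(f).close(); // empty the file before appending
            reopenPerLine(f, lines, true);
            counts[2] = Files.readAllLines(f.toPath()).size();
            return counts;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] c = demo();
        System.out.println("re-open per line: " + c[0] + " line(s)");
        System.out.println("open once:        " + c[1] + " line(s)");
        System.out.println("append mode:      " + c[2] + " line(s)");
    }
}
```

Re-opening without append leaves a single line, while the other two variants keep all three. Append mode would also work in the original program, but it keeps results from earlier runs in the file, so opening once is usually the simpler choice.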

Lucene Index Size

I have data like
1 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20 22 23 24 25 26 28 30 36 37 39 40 41 46 48 49 51 52 53 54 55 56 58 60 66 67 68 71 72 74 77 78 85 89 90 91 108 109 110 116 117 118 120 121 123 137 138 145 146 147 148 154 157 159 162 165 166 168 175 179 181 198 201 203 212 215 216 223 231 233 254 266 270 274 323 327 329 331 347 352 355 360 363 370 411 415 434 438 442 444 445 462 470 471 477 486 495 499 503 524 525 536 542 595 603 608 636 644 646 647 670 692 694 698 762 763 798 809 822 970 981 987 992 1040 1057 1066 1079 1089 1111 1233 1244 1302 1315 1327 1333 1336 1387 1411 1412 1432 1458 1486 1498 1509 1572 1573 1574 1607 1625 1784 1808 1824 1909 1933 1938 1940 2011 2077 2081 2093 2286 2289 2395 2427 2467 2911 2944 2962 2975 3121 3170 3172 3197 3236 3267 3334 3699 3731 3905 3945 3982 3999 4008 4161 4234 4235 4296 4374 4457 4494 4526 4717 4720 4723 4820 4875 5352 5423 5472 5728 5799 5813 5821 6032 6230 6244 6278 6859 6868 7186 7280 7401 8734 8832 8885 8886 8925 9363 9510 9517 9592 9707 9802 10002 11097 11192 11715 11716 11836 11945 11996 12025 12482 12703 12706 12887 13122 13372 13482 13577 14150 14161 14169 14461 14626 16057 16268 16415 17183 17398 17440 17464 18097 18690 18731 18834 20576 20603 21558 21839 22202 26201 26497 26654 26658 26776 28088 28531 28551 28775 29122 29407.
This is one line of data; there are many lines like it, stored in "training.txt". I am indexing it using the following Lucene indexing code:
public class training
{
public static void main(String args[]) throws IOException, ParseException
{
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
// IndexWriter w = new IndexWriter(FSDirectory.open(new File("../search/index")), analyzer, true, new IndexWriter.MaxFieldLength(1000000));
IndexWriter w = new IndexWriter(FSDirectory.open(new File("index")), analyzer, true, new IndexWriter.MaxFieldLength(2139999999));
File file = new File("training.txt");
FileInputStream fis = null;
BufferedInputStream bis = null;
DataInputStream dis = null;
File file1 = new File("fileName.txt");
FileInputStream fis1 = null;
BufferedInputStream bis1 = null;
DataInputStream dis1 = null;
try {
fis = new FileInputStream(file);
// Here BufferedInputStream is added for fast reading.
bis = new BufferedInputStream(fis);
dis = new DataInputStream(bis);
fis1 = new FileInputStream(file1);
// Here BufferedInputStream is added for fast reading.
bis1 = new BufferedInputStream(fis1);
dis1 = new DataInputStream(bis1);
// dis.available() returns 0 if the file does not have more lines.
while (dis.available() != 0 && dis1.available() != 0 ) {
String tempImg=dis1.readLine();
String temp=dis.readLine();
addDoc(w,tempImg,temp);
// System.out.println(temp);
}
// dispose all the resources after using them.
fis.close();
bis.close();
dis.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
w.optimize();
w.close();
}
private static void addDoc(IndexWriter w, String value1,String value2) throws IOException
{
Document doc = new Document();
doc.add(new Field("fileId", value1, Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("visualId", value2, Field.Store.YES, Field.Index.ANALYZED));
w.addDocument(doc);
}
}
There is another file, "fileName.txt", which holds the file names. My "training.txt" is 127.1 MB in size, and the index folder that gets created is 217.2 MB. I believe the index should be smaller than that.
My search code:
public class search
{
public static void main(String args[]) throws IOException, ParseException
{
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
String fname = "test.txt";
File file = new File(fname);
FileInputStream fis = null;
BufferedInputStream bis = null;
DataInputStream dis = null;
try {
fis = new FileInputStream(file);
Writer fos = null;
File outputFile = new File("outList.txt");
fos = new BufferedWriter(new FileWriter(outputFile));
// Here BufferedInputStream is added for fast reading.
bis = new BufferedInputStream(fis);
dis = new DataInputStream(bis);
while (dis.available() != 0)
{
Query q = new QueryParser(Version.LUCENE_CURRENT, "visualId", analyzer).parse(dis.readLine());
//3.search
int hitsPerPage = 200;
IndexSearcher searcher = new IndexSearcher(IndexReader.open(FSDirectory.open(new File("index")), true));
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
long startTime = System.currentTimeMillis();
searcher.search(q, collector);
long endTime = System.currentTimeMillis();
ScoreDoc[] hits = collector.topDocs().scoreDocs;
for(int i=0;i<hits.length;++i) {
int docId = hits[i].doc;
Document d = searcher.doc(docId);
String text = d.get("fileId");
fos.write(text);
fos.write("\n");
}
searcher.close();
}
// dispose all the resources after using them.
fis.close();
fos.close();
bis.close();
dis.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
//out.close();
}
}
My "test.txt" has the following content:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 55 56 57 58 59 60 61 63 64 65 66 67 69 70 72 73 76 77 78 80 82 83 85 86 88 89 90 91 92 93 94 95 97 99 100 102 105 106 107 108 109 110 111 112 114 115 116 117 118 119 120 121 122 124 126 127 128 129 130 132 133 135 136 137 138 141 142 143 144 145 147 148 151 153 154 155 156 157 160 164 165 167 168 169 170 172 173 174 175 176 178 179 180 181 182 183 184 190 191 194 195 199 200 202 206 208 211 215 216 217 220 228 231 234 239 246 248 250 254 259 264 266 267 268 270 271 272 275 276 278 281 284 285 292 296 297 300 306 307 314 316 317 320 321 322 323 325 326 327 330 331 333 336 343 345 348 349 350 351 353 354 355 357 358 360 361 362 364 365 367 371 372 379 381 384 385 386 388 391 396 398 399 404 405 406 407 409 412 415 423 424 427 428 429 431 432 434 435 436 442 443 444 453 458 461 462 466 468 472 479 493 494 495 496 500 501 502 503 504 506 507 508 509 510 515 518 519 521 526 528 533 535 537 538 540 544 545 547 549 551 569 570 574 582 583 586 597 599 601 605 607 618 623 624 632 644 645 649 651 661 683 694 701 702 718 737 738 739 743 751 762 776 777 778 792 797 800 803 809 811 812 813 817 825 828 833 843 853 854 875 889 892 900 918 919 922 941 949 951 961 963 964 965 966 967 969 975 976 977 979 980 990 992 993 1000 1007 1008 1009 1029 1036 1045 1047 1051 1052 1053 1058 1059 1061 1062 1064 1065 1066 1070 1072 1075 1081 1082 1083 1086 1093 1094 1101 1114 1116 1117 1136 1143 1152 1154 1158 1159 1165 1172 1188 1194 1198 1212 1216 1218 1220 1227 1236 1245 1269 1272 1280 1283 1284 1285 1287 1293 1295 1296 1303 1305 1307 1327 1329 1332 1358 1373 1374 1375 1384 1385 1386 1397 1404 1415 1416 1436 1437 1478 1481 1482 1485 1487 1489 1501 1503 1505 1506 1508 1511 1517 1518 1520 1521 1522 1524 1525 1527 1529 1545 1555 1556 1564 1577 1579 1583 1599 1606 1610 1611 1612 1615 1620 1632 1636 1640 1648 1654 1706 1711 1721 1746 1750 1758 1792 
1796 1802 1814 1820 1853 1869 1872 1897 1931 1932 1935 1946 1953 1982 2049 2082 2104 2107 2155 2211 2213 2216 2228 2253 2286 2329 2330 2332 2334 2377 2390 2399 2408 2427 2428 2433 2435 2440 2452 2475 2484 2498 2529 2559 2563 2626 2666 2675 2699 2754 2758 2765 2822 2847 2852 2882 2889 2893 2895 2898 2902 2906 2908 2925 2929 2932 2936 2939 2940 2971 2977 2980 2999 3022 3023 3024 3028 3086 3107 3134 3136 3140 3152 3156 3160 3174 3176 3182 3186 3192 3195 3197 3209 3216 3225 3242 3247 3249 3259 3279 3283 3303 3341 3349 3350 3352 3407 3429 3455 3462 3475 3476 3495 3515 3564 3581 3595 3637 3648 3653 3660 3681 3707 3735 3807 3817 3839 3850 3852 3856 3860 3878 3884 3889 3909 3916 3920 3980 3988 3997 4075 4120 4122 4123 4125 4152 4156 4157 4159 4191 4211 4244 4248 4307 4310 4434 4444 4446 4455 4462 4466 4503 4509 4516 4517 4525 4532 4551 4554 4559 4563 4564 4565 4573 4576 4581 4586 4634 4666 4669 4691 4730 4738 4748 4796 4817 4829 4832 4837 4846 4859 4896 4909 4919 4943 4962 5119 5132 5162 5237 5251 5275 5376 5387 5407 5441 5461 5559 5606 5608 5616 5692 5792 5797 5806 5837 5858 5947 6146 6245 6313 6320 6466 6632 6640 6648 6683 6759 6859 6987 6988 6989 6995 7003 7131 7171 7197 7223 7225 7280 7283 7299 7304 7320 7355 7357 7424 7451 7493 7586 7678 7690 7878 7997 8024 8096 8261 8275 8294 8465 8542 8556 8646 8667 8679 8685 8695 8707 8718 8724 8774 8786 8795 8808 8817 8819 8913 8932 8941 8996 9065 9069 9071 9085 9258 9321 9403 9408 9420 9456 9468 9481 9523 9528 9546 9559 9575 9584 9590 9592 9626 9648 9675 9727 9740 9742 9747 9776 9778 9836 9850 9909 10022 10046 10049 10056 10222 10288 10366 10385 10425 10429 10485 10546 10691 10744 10786 10912 10945 10958 10980 11043 11120 11205 11420 11451 11518 11551 11557 11568 11580 11633 11635 11652 11667 11728 11749 11760 11940 11963 11990 12225 12360 12367 12370 12375 12455 12468 12472 12476 12573 12632 12633 12731 12732 12745 12921 12922 12931 13303 13331 13332 13338 13364 13366 13386 13397 13510 13528 13548 13551 13575 13597 13654 13662 
13676 13688 13689 13690 13693 13694 13720 13728 13743 13757 13901 13999 14007 14074 14190 14214 14245 14389 14452 14487 14496 14511 14538 14578 14689 14726 14756 14829 14887 15357 15395 15485 15710 15754 15824 16128 16161 16220 16323 16384 16678 16819 16825 16848 17075 17375 17391 17417 17511 17575 17841 18439 18734 18940 18961 19399 19896 19920 19945 20050 20276 20578 20960 20964 20967 20986 21009 21393 21513 21591 21670 21676 21839 21849 21898 21911 21960 22066 22072 22271 22354 22480 22759 23033 23070 23635 23990 24073 24287 24784 24824 24882 25395 25625 25668 25938 26002 26036 26054 26056 26085 26122 26153 26173 26321 26358 26385 26423 26450 26456 26739 26796 26823 26987 27196 27206 27214 27255 27773 27962 28209 28225 28260 28369 28405 28443 28568 28585 28637 28676 28724 28753 28770 28775 28877 28944 29026 29180 29221 29225 29240 29327 29333 29507
Thanks,
Ravi.
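When you add a Lucene field with Field.Store.YES, its value is stored in the index in addition to being indexed. As a result the index contains a full copy of every stored field's text on top of the search structures, which makes it larger than expected. Your search code only reads back "fileId", so "visualId" could be indexed with Field.Store.NO instead.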
