How to implement Autocomplete using Trie with a HashMap?

Below is my Trie implemented with a HashMap, but I am not sure how to implement the autocomplete part. I have seen people use a LinkedList to implement a Trie, but I want to understand how to do it with a HashMap. Any help is appreciated.
Is there a way to look up a prefix, go to the node at the end of that prefix, and return everything below it as strings? If so, how can I achieve this with the HashMap implementation? Or should I avoid a HashMap and use a LinkedList instead? I am also not sure why one would be better than the other.
public class TrieNode {
Map<Character, TrieNode> children;
boolean isEndOfWord;
public TrieNode() {
isEndOfWord = false;
children = new HashMap<>();
}
}
public class TrieImpl {
private TrieNode root;
public TrieImpl() {
root = new TrieNode();
}
// iterative insertion into Trie Data Structure
public void insert(String word) {
if (searchTrie(word))
return;
TrieNode current = root;
for(int i=0; i<word.length(); i++) {
char ch = word.charAt(i);
TrieNode node = current.children.get(ch);
if(node == null) {
node = new TrieNode();
current.children.put(ch, node);
}
current = node;
}
current.isEndOfWord = true;
}
// search iteratively
public boolean searchTrie(String word) {
TrieNode current = root;
for(int i=0; i < word.length(); i++) {
char ch = word.charAt(i);
TrieNode node = current.children.get(ch);
if(node == null) {
return false;
}
current = node;
}
return current.isEndOfWord;
}
// delete a word recursively
private boolean deleteRecursive(TrieNode current, String word, int index) {
if(index == word.length()) {
if(!current.isEndOfWord) {
return false;
}
current.isEndOfWord = false;
return current.children.size() == 0;
}
char ch = word.charAt(index);
TrieNode node = current.children.get(ch);
if(node == null) {
return false;
}
boolean shouldDeleteCurrentNode = deleteRecursive(node, word, index+1);
if(shouldDeleteCurrentNode) {
current.children.remove(ch);
return current.children.size() == 0;
}
return false;
}
// calling the deleteRecursively function
public boolean deleteRecursive(String word) {
return deleteRecursive(root, word, 0);
}
public static void main(String[] args) {
TrieImpl obj = new TrieImpl();
obj.insert("amazon");
obj.insert("amazon prime");
obj.insert("amazing");
obj.insert("amazing spider man");
obj.insert("amazed");
obj.insert("alibaba");
obj.insert("ali express");
obj.insert("ebay");
obj.insert("walmart");
boolean isExists = obj.searchTrie("amazing spider man");
System.out.println(isExists);
}
}

I was in a hurry looking for another solution here, but this is an interesting question.
To answer this part:
Is there a way to look for a prefix, then go to the end of the prefix
and look for its children and return them back as strings?
Yes. Suppose the prefix is ama. Walk the trie the same way searchTrie does; when the loop finishes, current points to the node for the last character of the prefix (the second a in ama).
You can then write a method like the one below:
public List<String> getPrefixStrings(TrieNode current){
// DO DFS here and put all character with isEndOfWord = true in the list
// keep on recursing to this same method and adding to the list
// then return the list
}
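The outline above can be filled in roughly as follows. This is a minimal sketch rather than the answerer's own code: the class name PrefixTrie, the method autocomplete and the helper collect are made up for illustration, and the node layout simply mirrors the HashMap-based TrieNode from the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class for illustration; mirrors the question's HashMap-based TrieNode.
class PrefixTrie {
    private static class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isEndOfWord;
    }

    private final Node root = new Node();

    void insert(String word) {
        Node current = root;
        for (char ch : word.toCharArray()) {
            current = current.children.computeIfAbsent(ch, k -> new Node());
        }
        current.isEndOfWord = true;
    }

    /** Returns every stored word that starts with prefix (HashMap order, so unsorted). */
    List<String> autocomplete(String prefix) {
        List<String> results = new ArrayList<>();
        Node current = root;
        // Step 1: walk down to the node where the prefix ends.
        for (char ch : prefix.toCharArray()) {
            current = current.children.get(ch);
            if (current == null) {
                return results; // prefix is not in the trie
            }
        }
        // Step 2: depth-first search below that node.
        collect(current, new StringBuilder(prefix), results);
        return results;
    }

    private void collect(Node node, StringBuilder path, List<String> results) {
        if (node.isEndOfWord) {
            results.add(path.toString());
        }
        for (Map.Entry<Character, Node> entry : node.children.entrySet()) {
            char ch = entry.getKey();
            path.append(ch);                       // extend the word by one character
            collect(entry.getValue(), path, results);
            path.deleteCharAt(path.length() - 1);  // backtrack before the next sibling
        }
    }

    public static void main(String[] args) {
        PrefixTrie trie = new PrefixTrie();
        for (String w : new String[] {"amazon", "amazon prime", "amazing", "amazed", "alibaba", "ebay"}) {
            trie.insert(w);
        }
        // Prints the four stored words starting with "ama", in no particular order.
        System.out.println(trie.autocomplete("ama"));
    }
}
```

Because a HashMap does not keep its keys in order, the completions come back unsorted; swapping in a TreeMap for the children would be one way to get them alphabetically.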

Related

How to find the longest word in a prefix tree recursively?

I have the following data structure:
This tree stores only characters in lowercase.
I'm trying to write a method that finds the longest word in the tree recursively.
I'm having difficulty writing a method that checks each branch of the nodes recursively.
Here are the given classes I'm using, showing only the relevant methods:
public class Tree {
private final Node root;
public Tree() {
root = new Node('0');
}
private String getWordOfBranch(final Node[] nodes, final int i) {
if (nodes[i] == null) {
return "";
}
if (nodes[i].isLeaf()) {
return String.valueOf(nodes[i].getValue());
}
return nodes[i].getValue() + getWordOfBranch(nodes[i].children, i);
}
public class Node {
private final char value;
protected Node[] children;
public Node(final char value) {
this.value = value;
children = new Node[26];
}
public boolean isLeaf() {
for (final Node child : children) {
if (child != null) {
return false;
}
}
return true;
}
public char getValue() {
return value;
}

Well, in this case you are only following the branch at one specific index i. What you should be doing is looping through all of the children and taking the longest word found among them. Also, your Node class does not need a fixed number of child slots; it can use a dynamically sized list such as an ArrayList, since each node has only as many children as were actually added.
public class Node {
private final char value;
protected ArrayList<Node> children;
public Node(final char value) {
this.value = value;
children = new ArrayList<Node>();
}
public boolean isLeaf() {
for (final Node child : children) {
if (child != null) {
return false;
}
}
return true;
}
public char getValue() {
return value;
}
public ArrayList<Node> getChildren() {
return children;
}
public String getLongestWord(Node root) {
if (root.isLeaf()) {
return String.valueOf(root.getValue());
}
else {
String longest = "";
for (Node child : root.getChildren()) {
String longWordInChild = getLongestWord(child);
if (longWordInChild.length() > longest.length()) {
longest = longWordInChild;
}
}
return root.getValue() + longest;
}
}

I made some changes to your code.
First the Node class.
import java.util.ArrayList;
import java.util.List;
public class Node {
private final char value;
protected List<Node> children;
public Node(char letter) {
value = letter;
children = new ArrayList<>();
}
private static boolean isValidValue(Node node) {
boolean isValid = false;
if (node != null) {
char ch = node.getValue();
isValid = 'a' <= ch && ch <= 'z';
}
return isValid;
}
public boolean addChild(Node child) {
boolean added = false;
if (child != null) {
if (isValidValue(child)) {
boolean found = false;
for (Node kid : children) {
found = kid != null && kid.getValue() == child.getValue();
if (found) {
break;
}
}
if (!found) {
added = children.add(child);
}
}
}
return added;
}
public List<Node> getChildren() {
return children;
}
public char getValue() {
return value;
}
}
I used List for the children, rather than an array, because an array has a fixed size and a List does not.
Now the Tree class. Note that I added a main() method to the class just for testing purposes. The main() method creates the tree structure in the image in your question.
A tree data structure has levels and also has leaves. A leaf is a node in the tree that has no child nodes. Hence every leaf in your tree is the last letter of a word. The leaves at the highest level represent the longest words. (Note that the level of the root node in the tree is zero.)
import java.util.ArrayList;
import java.util.List;
public class Tree {
private int longest;
private List<String> words;
private Node root = new Node('\u0000');
public List<String> getWords() {
return words;
}
public Node getRoot() {
return root;
}
public void visit() {
visit(root, 0, new StringBuilder());
}
public void visit(Node node, int level, StringBuilder word) {
if (node != null) {
word.append(node.getValue());
List<Node> children = node.getChildren();
if (children.size() == 0) {
if (level > longest) {
longest = level;
words = new ArrayList<>();
}
if (level == longest) {
words.add(word.toString());
}
}
else {
for (Node child : children) {
word.delete(level, word.length());
visit(child, level + 1, word);
}
}
}
}
/**
* For testing only.
*/
public static void main(String[] args) {
Tree tree = new Tree();
Node root = tree.getRoot();
Node j = new Node('j');
root.addChild(j);
Node r = new Node('r');
root.addChild(r);
Node a = new Node('a');
j.addChild(a);
Node v = new Node('v');
a.addChild(v);
Node a2 = new Node('a');
v.addChild(a2);
Node a3 = new Node('a');
r.addChild(a3);
Node o = new Node('o');
r.addChild(o);
Node d = new Node('d');
a3.addChild(d);
Node n = new Node('n');
a3.addChild(n);
Node d2 = new Node('d');
n.addChild(d2);
Node u = new Node('u');
a3.addChild(u);
Node m = new Node('m');
u.addChild(m);
Node s = new Node('s');
o.addChild(s);
Node e = new Node('e');
s.addChild(e);
tree.visit();
System.out.println(tree.getWords());
}
}
Method visit(Node, int, StringBuilder) is the recursive method. It traverses every path in the tree and appends the characters in each node to a StringBuilder. Hence the StringBuilder contains the word obtained by traversing a single path in the tree - from the root to the leaf.
I also keep track of the node level since the highest level means the longest word.
Finally I store all the longest words in another List.
Running the above code produces the following output:
[java, rand, raum, rose]

Trie Recursion Strange Output

I've been working on some code that recursively iterates through a trie filled with words.
There are a lot of problems with this right now, but ignoring those, I was wondering why "?"s are printed here when I have no "?"s in my trie and no print statements that would output them.
Here's my code for the recursion portion of my work. Please ask if you need anything else.
public String recurse(Node n){//RECURSION
String build = "";
build += n.getVal();
System.out.print(build);
if(n.getChild() != null){
recurse(n.getChild());
}
if(n.getSibling() != null){
recurse(n.getSibling());
}
return build;
}
This is the output I'm currently getting:
duck^?free^?good^?real^?hum^?rtful^?duck^?free^?good^?real^?hum^?rtful^?
Any help is appreciated. Thanks a lot.
EDIT
Here are the words in my trie (I'm using a small number of words to test first):
argument
bash
cow
duck
free
good
real
hum
ask
allow
hurtful
Here is my Node class:
public class Node{
private Node child;
private Node sibling;
private char value;
public Node(char val){
value = val;
}
public char getVal(){
return value;
}
public Node getChild(){
return child;
}
public Node getSibling(){
return sibling;
}
public void setVal(char val){
value = val;
}
public void setChild(Node nextReference){
child = nextReference;
}
public void setSibling(Node nextReference){
sibling = nextReference;
}
}
I filled the DLB trie with the code like this:
public boolean add(String s){
if (s == null)
return false;
s = s + SENTINEL; //sentinel is '^'
StringCharacterIterator iterator = new StringCharacterIterator(s);
if(root == null){//this is if there are NO values in the trie
root = new Node(iterator.current());
Node currentNode = root;
iterator.next();
while(iterator.getIndex() < iterator.getEndIndex()){
Node newNode = new Node(iterator.current());
currentNode.setChild(newNode);
currentNode = currentNode.getChild();
iterator.next();
}
}else{
Node currentNode = root;
while(iterator.getIndex() < iterator.getEndIndex()){
while(iterator.current()!=currentNode.getVal()){
if(currentNode.getSibling() == null){
Node newNode = new Node(iterator.current());
currentNode.setSibling(newNode);
currentNode = currentNode.getSibling();
break;
}else{
currentNode = currentNode.getSibling();
}
}
iterator.next();
if(currentNode.getChild() == null){
Node newNode = new Node(iterator.current());
currentNode.setChild(newNode);
}
//iterator.next();
currentNode = currentNode.getChild();
}
}
return true;
}
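One possible explanation for the "?" characters, offered as a guess since the thread has no confirmed answer: in the else branch of add, iterator.next() is followed by new Node(iterator.current()) with no check that the iterator is still inside the string. Once the sentinel has been consumed, current() returns CharacterIterator.DONE ('\uFFFF'), so an extra node holding that character may be created after each '^', and many consoles render it as "?". The small sketch below only demonstrates the iterator behaviour; the class and method names are made up for illustration.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

// Hypothetical demo class, not part of the asker's code.
class IteratorEndDemo {
    /** Returns what current() reports once the iterator has moved past the last character. */
    static char currentAfterEnd(String s) {
        StringCharacterIterator it = new StringCharacterIterator(s);
        while (it.getIndex() < it.getEndIndex()) {
            it.next();
        }
        return it.current();
    }

    public static void main(String[] args) {
        char c = currentAfterEnd("duck^");
        System.out.println(c == CharacterIterator.DONE); // prints true
        System.out.println((int) c);                     // prints 65535
    }
}
```

If this is indeed the cause, checking iterator.getIndex() < iterator.getEndIndex() before creating the child node should avoid the extra node.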

AutoComplete using a Trie in Java

I am working on an assignment that implements autocomplete and a dictionary. I have successfully implemented spellcheck and the addWord() and isWord() functions.
But I am not able to implement the function that predicts words for autocompletion.
package spelling;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
/**
* A trie data structure that implements the Dictionary and the AutoComplete ADT
* @author You
*
*/
public class AutoCompleteDictionaryTrie implements Dictionary, AutoComplete {
private TrieNode root;
private int size;
public AutoCompleteDictionaryTrie()
{
root = new TrieNode();
size=0;
}
/** Insert a word into the trie.
* For the basic part of the assignment (part 2), you should ignore the word's case.
* That is, you should convert the string to all lower case as you insert it. */
public boolean addWord(String word)
{
//TODO: Implement this method.
String Word=word.toLowerCase();
if(isWord(Word))
return false;
HashMap<Character, TrieNode> children=root.children;
for(int i=0; i<Word.length(); i++){
char c = Word.charAt(i);
TrieNode t;
if(children.containsKey(c)){
t = children.get(c);
}else{
t = new TrieNode(""+(c));
children.put(c, t);
}
children = t.children;
if(i==Word.length()-1)
{
t.isWord = true;
size++;
}
}
return true;
}
/**
* Return the number of words in the dictionary. This is NOT necessarily the same
* as the number of TrieNodes in the trie.
*/
public int size()
{
//TODO: Implement this method
return size;
}
/** Returns whether the string is a word in the trie */
@Override
public boolean isWord(String s)
{
// TODO: Implement this method
TrieNode t = searchNode(s.toLowerCase());
if(t != null && t.isWord)
return true;
else
return false;
}
public TrieNode searchNode(String str){
HashMap<Character, TrieNode> children = root.children;
TrieNode t = null;
for(int i=0; i<str.length(); i++){
char c = str.charAt(i);
if(children.containsKey(c)){
t = children.get(c);
children = t.children;
}else{
return null;
}
}
return t;
}
/**
* * Returns up to the n "best" predictions, including the word itself,
* in terms of length
* If this string is not in the trie, it returns null.
* @param text The text to use at the word stem
* @param n The maximum number of predictions desired.
* @return A list containing the up to n best predictions
*/
@Override
public List<String> predictCompletions(String prefix, int numCompletions)
{
// TODO: Implement this method
// This method should implement the following algorithm:
// 1. Find the stem in the trie. If the stem does not appear in the trie, return an
// empty list
// 2. Once the stem is found, perform a breadth first search to generate completions
// using the following algorithm:
// Create a queue (LinkedList) and add the node that completes the stem to the back
// of the list.
// Create a list of completions to return (initially empty)
// While the queue is not empty and you don't have enough completions:
// remove the first Node from the queue
// If it is a word, add it to the completions list
// Add all of its child nodes to the back of the queue
// Return the list of completions
List<String> completions=null;
int counter=0;
if (prefix==null){
return Collections.emptyList();
}
prefix=prefix.toLowerCase();
if(isWord(prefix))
completions.add(prefix);
LinkedList nodes = new LinkedList();
TrieNode curr=searchNode(prefix);
nodes.addLast(curr);
while(!nodes.isEmpty() && counter!=numCompletions)
{
if((nodes.removeFirst()).isWord)
completions.add(curr.getText());
TrieNode next = null;
for (Character c : curr.getValidNextCharacters()) {
next = curr.getChild(c);
}
}
return Collections.emptyList();
}
public void checkNull(String word){
if (word==null)
throw new NullPointerException("Null word passed");
}
// For debugging
public void printTree()
{
printNode(root);
}
/** Do a pre-order traversal from this node down */
public void printNode(TrieNode curr)
{
if (curr == null)
return;
System.out.println(curr.getText());
TrieNode next = null;
for (Character c : curr.getValidNextCharacters()) {
next = curr.getChild(c);
printNode(next);
}
}
}
And this is the code of the TrieNode class:
package spelling;
import java.util.HashMap;
import java.util.Set;
/**
* Represents a node in a Trie
* @author UC San Diego Intermediate Programming MOOC Team
*
*/
class TrieNode {
HashMap<Character, TrieNode> children;
private String text; // Maybe omit for space
boolean isWord;
/** Create a new TrieNode */
public TrieNode()
{
children = new HashMap<Character, TrieNode>();
text = "";
isWord = false;
}
/** Create a new TrieNode given a text String to store in it */
public TrieNode(String text)
{
this();
this.text = text;
}
/** Return the TrieNode that is the child when you follow the
* link from the given Character
* @param c The next character in the key
* @return The TrieNode that character links to, or null if that link
* is not in the trie.
*/
public TrieNode getChild(Character c)
{
return children.get(c);
}
/** Inserts this character at this node.
* Returns the newly created node, if c wasn't already
* in the trie. If it was, it does not modify the trie
* and returns null.
* @param c The character that will link to the new node
* @return The newly created TrieNode, or null if the node is already
* in the trie.
*/
public TrieNode insert(Character c)
{
if (children.containsKey(c)) {
return null;
}
TrieNode next = new TrieNode(text + c.toString());
children.put(c, next);
return next;
}
/** Return the text string at this node */
public String getText()
{
return text;
}
/** Set whether or not this node ends a word in the trie. */
public void setEndsWord(boolean b)
{
isWord = b;
}
/** Return whether or not this node ends a word in the trie. */
public boolean endsWord()
{
return isWord;
}
/** Return the set of characters that have links from this node */
public Set<Character> getValidNextCharacters()
{
return children.keySet();
}
}
Even though the algorithm is there I am not able to implement it. Any kind of help would be greatly appreciated.

Are you trying to solve this as part of the UC San Diego course on Coursera?
If so, all you have to do is follow the algorithm written as a comment inside the class.
Anyway, here is a copy of my implementation of this method. Please don't copy and paste it as your solution; use it as guidance. I added comments in the code to help you understand my algorithm:
//Trying to find the stem in Trie
String prefixToCheckLowerCase = prefix.toLowerCase();
int completionsCount = 0;
List<String> completions = new LinkedList<String>();
TrieNode traversal = root;
for (int i = 0; i < prefixToCheckLowerCase.length(); i++)
{
if (traversal.getValidNextCharacters().contains(prefixToCheckLowerCase.charAt(i)))
{
traversal = traversal.getChild(prefixToCheckLowerCase.charAt(i));
}
//Means stem not found, returns an empty list
else
return completions;
}
//If the current node ends a word, increment the counter and add it to the completions list
if (traversal.endsWord())
{
completionsCount=1;
completions.add(traversal.getText());
}
List<TrieNode> nodesToBeSearched = new LinkedList<TrieNode>();
List<Character> ChildCharaterList = new LinkedList<Character>(traversal.getValidNextCharacters());
//Filling the list with the children of the current node: the first level of the breadth-first search
for (int i=0; i<ChildCharaterList.size(); i++)
{
nodesToBeSearched.add(traversal.getChild(ChildCharaterList.get(i)));
}
//Loop over the list elements and check whether any completions exist; inside it we also add each node's children to the list
while (nodesToBeSearched!=null && nodesToBeSearched.size()>0 && completionsCount < numCompletions)
{
TrieNode trieNode = nodesToBeSearched.remove(0);
if (trieNode.endsWord())
{
completionsCount++;
completions.add(trieNode.getText());
}
List<Character> subTrieNodeCholdren = new LinkedList<Character>(trieNode.getValidNextCharacters());
//Adding all next-level nodes to the list; the queue takes the place of recursion
for (int i=0; i<subTrieNodeCholdren.size();i++)
{
nodesToBeSearched.add(trieNode.getChild(subTrieNodeCholdren.get(i)));
}
}
return completions;

import java.util.ArrayList;
class TrieNode{
char data;
boolean isTerminating;
TrieNode children[];
int childCount;
public TrieNode(char data) {
this.data = data;
isTerminating = false;
children = new TrieNode[26];
childCount = 0;
}
}
public class Trie {
private TrieNode root;
//ArrayList<String> ans=new ArrayList<>();
public Trie() {
root = new TrieNode('\0');
}
private void add(TrieNode root, String word){
if(word.length() == 0){
root.isTerminating = true;
return;
}
int childIndex = word.charAt(0) - 'a';
TrieNode child = root.children[childIndex];
if(child == null){
child = new TrieNode(word.charAt(0));
root.children[childIndex] = child;
root.childCount++;
}
add(child, word.substring(1));
}
public void add(String word){
add(root, word);
}
private void searchHelper(TrieNode root,String word,String ans)
{
try
{
if(word.length()==0)
{
if(root.isTerminating == true)
{
System.out.println(ans);
}
for(int i=0;i<26;i++)
{
TrieNode temp=root.children[i];
if(temp !=null)
{
//ans=ans+temp.data;
//System.out.println("test check "+ans );
searchHelper(temp,word,ans+temp.data);
}
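}
// Every completion below this node has been printed; stop here so that
// word.charAt(0) below is not called on an empty string
return;
}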
int childIndex=word.charAt(0)-'a';
TrieNode child=root.children[childIndex];
if(child == null)
{
//System.out.print();
return ;
}
ans=ans+word.charAt(0);
searchHelper(child,word.substring(1),ans);
}
catch(Exception e)
{
//System.out.println("error");
}
}
public void search(String word)
{
String s="";
searchHelper(root,word,s);
}
public void autoComplete(ArrayList<String> input, String word) {
// Complete this function
// Print the output as specified in question
Trie ansTrie = new Trie();
for(int i=0;i<input.size();i++)
{
ansTrie.add(input.get(i));
}
ansTrie.search(word);
}
}
I hope this helps resolve your doubt.
Apologies in advance for any indentation errors.

import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.stream.Collectors;
public class TrieImpl {
static class Element {
private Trie trie;
private String word;
Element(Trie trie, String word) {
this.trie = trie;
this.word = word;
}
}
static class Trie {
private boolean isLeaf;
private Map<Character, Trie> children;
private Map<Character, Integer> character;
Trie() {
isLeaf = false;
children = new HashMap<>();
character = new HashMap<>();
}
public void insert(String word) {
Trie curr = this;
for (Character ch : word.toCharArray()) {
curr.children.putIfAbsent(ch, new Trie());
int count = (curr.character.get(ch) == null) ? 1 : curr.character.get(ch) + 1;
curr.character.put(ch, count);
curr = curr.children.get(ch);
}
curr.isLeaf = true;
}
public boolean search(String word) {
Trie curr = this;
for (Character ch : word.toCharArray()) {
if (curr.children.get(ch) == null)
return false;
curr = curr.children.get(ch);
}
return curr.isLeaf;
}
public void delete(String word) {
if (search(word)) {
Trie lastSecond = this;
Character charToRemove = word.charAt(0);
Trie curr = this;
int i = -1;
while (i < word.length() && curr != null) {
if (curr.isLeaf && i != word.length() - 1) {
charToRemove = word.charAt(i + 1);
lastSecond = curr;
}
i = i + 1;
if (i < word.length())
curr = curr.children.get(word.charAt(i));
}
lastSecond.children.remove(charToRemove);
}
}
public int findPrefixCount(String word) {
Trie curr = this;
Character lastChar = null;
int count = 0;
for (Character ch : word.toCharArray()) {
if (curr.children.get(ch) == null)
return 0;
if (count < word.length() - 1) {
curr = curr.children.get(ch);
count++;
}
lastChar = ch;
}
if (lastChar != null && curr.character.get(lastChar) != null)
return curr.character.get(lastChar);
else
return 0;
}
public Set<String> autoComplete(String word) {
Trie curr = this;
int count = 0;
String wo = "";
Queue<Element> queue = new LinkedList<>();
Set<String> set = new HashSet<>();
for (Character ch : word.toCharArray()) {
curr = curr.children.get(ch);
// The prefix is not in the trie: return the empty set instead of hitting a NullPointerException
if (curr == null)
return set;
wo += ch;
}
if (curr != null)
queue.add(new Element(curr, wo));
while (!queue.isEmpty()) {
Element elem = queue.poll();
Trie current = elem.trie;
String temp = elem.word;
if (current != null && current.isLeaf)
set.add(temp);
// Iterate over the children map itself so keys and child nodes cannot get out of sync
for (Map.Entry<Character, Trie> entry : current.children.entrySet()) {
queue.add(new Element(entry.getValue(), temp + entry.getKey()));
}
}
return set;
}
}
public static void main(String[] args) {
Trie head = new Trie();
head.insert("techie");
head.insert("techi");
head.insert("tech");
head.insert("tecabc");
head.insert("tecabk");
head.insert("tecabd");
head.insert("tecalmz");
Set<String> words = head.autoComplete("t");
words.stream().forEach(x -> System.out.println(x));
}
}

Get Words out of a Trie Data Structure

I have the following Trie data structure:
public class CDictionary implements IDictionary {
private static final int N = 'z' -'a'+1;
private static class Node {
private boolean end = false;
private Node[] next = new Node[N];
}
private int size = 0;
private Node root = new Node();
@Override
public boolean contains(String word) {
Node node = this.contains(root,word,0);
if (node == null) {
return false;
}
return node.end;
}
private Node contains(Node node, String str, int d) {
if (node == null) return null;
if (d == str.length()) return node;
char c = str.charAt(d);
return contains(node.next[c-'a'], str, d+1);
}
@Override
public void insert(String word) {
this.root = insert(this.root, word, 0);
this.size++;
}
private Node insert(Node node, String str, int d) {
if (node == null) node = new Node();
if (d == str.length()) {
node.end = true;
return node;
}
char c = str.charAt(d);
node.next[c-'a'] = this.insert(node.next[c-'a'], str, d+1);
return node;
}
@Override
public int size() {
return size;
}
The Trie is filled with some words like
for, the, each, home, is, it, egg, red...
Now I need a function that returns all words of a specific length, for example length 3:
public List<String> getWords(int length) {
}
With the words mentioned above, it should return a list containing
for,the,egg,red
The problem is: how can I recover these words from the trie structure?

You need to recurse through your structure to a maximum depth equal to the requested word length (in this case 3).
You could do this by adding a couple of methods to your dictionary...
public List<String> findWordsOfLength(int length) {
// Create new empty list for results
List<String> results = new ArrayList<>();
// Start at the root node (level 0)...
findWordsOfLength(root, "", 0, length, results);
// Return the results
return results;
}
public void findWordsOfLength(Node node, String wordSoFar, int depth, int maxDepth, List<String> results) {
// Go through each "child" of this node
for(int k = 0; k < node.next.length; k++) {
Node child = node.next[k];
// If this child exists...
if(child != null) {
// Work out the letter that this child represents (the cast is needed because 'a' + k is an int)
char letter = (char) ('a' + k);
// The child sits one level below this node, so it spells depth + 1 letters
if(depth + 1 == maxDepth) {
// Only add it if a word actually ends here; a mere prefix of a longer word is skipped
if(child.end) {
results.add(wordSoFar + letter);
}
} else {
// Otherwise recurse to the next level
findWordsOfLength(child, wordSoFar + letter, depth + 1, maxDepth, results);
}
}
}
}
(I have not compiled / tested this, but it should give you an idea of what you need to do)
Hope this helps.

Getting a list of words from a Trie

I'd like to change the following code so that, instead of checking whether a matching word exists in the Trie, it returns a list of all words beginning with the prefix entered by the user. Can someone point me in the right direction? I can't get it working at all.
public boolean search(String s)
{
Node current = root;
System.out.println("\nSearching for string: "+s);
while(current != null)
{
for(int i=0;i<s.length();i++)
{
if(current.child[(int)(s.charAt(i)-'a')] == null)
{
System.out.println("Cannot find string: "+s);
return false;
}
else
{
current = current.child[(int)(s.charAt(i)-'a')];
System.out.println("Found character: "+ current.content);
}
}
// If we are here, the string exists.
// But to ensure unwanted substrings are not found:
if (current.marker == true)
{
System.out.println("Found string: "+s);
return true;
}
else
{
System.out.println("Cannot find string: "+s +"(only present as a substring)");
return false;
}
}
return false;
}
}

I faced this problem while trying to build a text auto-complete module. I solved it with a Trie in which each node stores its parent node as well as its children.
First I search for the node where the input prefix ends. Then I traverse the sub-tree rooted at that prefix node. Whenever a node marked as the end of a word (isLeaf) is encountered, a word starting with the input prefix has been found. Starting from that node, I follow the parent links upward until I reach the prefix node, pushing each node's character onto a stack along the way. Finally I take the prefix and append the characters popped from the stack, which gives the complete word, and save it in an ArrayList.
At the end of the traversal the list holds all the words that start with the input prefix. Here is the code with a usage example:
class TrieNode
{
char c;
TrieNode parent;
HashMap<Character, TrieNode> children = new HashMap<Character, TrieNode>();
boolean isLeaf;
public TrieNode() {}
public TrieNode(char c){this.c = c;}
}
public class Trie
{
private TrieNode root;
ArrayList<String> words;
TrieNode prefixRoot;
String curPrefix;
public Trie()
{
root = new TrieNode();
words = new ArrayList<String>();
}
// Inserts a word into the trie.
public void insert(String word)
{
HashMap<Character, TrieNode> children = root.children;
TrieNode crntparent;
crntparent = root;
//cur children parent = root
for(int i=0; i<word.length(); i++)
{
char c = word.charAt(i);
TrieNode t;
if(children.containsKey(c)){ t = children.get(c);}
else
{
t = new TrieNode(c);
t.parent = crntparent;
children.put(c, t);
}
children = t.children;
crntparent = t;
//set leaf node
if(i==word.length()-1)
t.isLeaf = true;
}
}
// Returns if the word is in the trie.
public boolean search(String word)
{
TrieNode t = searchNode(word);
if(t != null && t.isLeaf){return true;}
else{return false;}
}
// Returns if there is any word in the trie
// that starts with the given prefix.
public boolean startsWith(String prefix)
{
if(searchNode(prefix) == null) {return false;}
else{return true;}
}
public TrieNode searchNode(String str)
{
Map<Character, TrieNode> children = root.children;
TrieNode t = null;
for(int i=0; i<str.length(); i++)
{
char c = str.charAt(i);
if(children.containsKey(c))
{
t = children.get(c);
children = t.children;
}
else{return null;}
}
prefixRoot = t;
curPrefix = str;
words.clear();
return t;
}
///////////////////////////
void wordsFinderTraversal(TrieNode node, int offset)
{
// print(node, offset);
if(node.isLeaf==true)
{
//println("leaf node found");
TrieNode altair;
altair = node;
Stack<String> hstack = new Stack<String>();
while(altair != prefixRoot)
{
//println(altair.c);
hstack.push( Character.toString(altair.c) );
altair = altair.parent;
}
String wrd = curPrefix;
while(hstack.empty()==false)
{
wrd = wrd + hstack.pop();
}
//println(wrd);
words.add(wrd);
}
Set<Character> kset = node.children.keySet();
//println(node.c); println(node.isLeaf);println(kset);
Iterator itr = kset.iterator();
ArrayList<Character> aloc = new ArrayList<Character>();
while(itr.hasNext())
{
Character ch = (Character)itr.next();
aloc.add(ch);
//println(ch);
}
// here you can play with the order of the children
for( int i=0;i<aloc.size();i++)
{
wordsFinderTraversal(node.children.get(aloc.get(i)), offset + 2);
}
}
void displayFoundWords()
{
println("_______________");
for(int i=0;i<words.size();i++)
{
println(words.get(i));
}
println("________________");
}
}//
Example
Trie prefixTree;
prefixTree = new Trie();
prefixTree.insert("GOING");
prefixTree.insert("GONG");
prefixTree.insert("PAKISTAN");
prefixTree.insert("SHANGHAI");
prefixTree.insert("GONDAL");
prefixTree.insert("GODAY");
prefixTree.insert("GODZILLA");
if( prefixTree.startsWith("GO")==true)
{
TrieNode tn = prefixTree.searchNode("GO");
prefixTree.wordsFinderTraversal(tn,0);
prefixTree.displayFoundWords();
}
if( prefixTree.startsWith("GOD")==true)
{
TrieNode tn = prefixTree.searchNode("GOD");
prefixTree.wordsFinderTraversal(tn,0);
prefixTree.displayFoundWords();
}

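After building the trie, you can run a DFS starting from the node where the prefix ends.
Here node is a trie node, word is the word built so far, and res is the list of results. The sketch assumes each node is a dict of children and that the key END marks the end of a word; adjust this to your own node layout:
END = "*"  # assumed end-of-word marker key, not part of any particular trie

def dfs(self, node, word, res):
    # When the current node ends a word, add the word built so far to the list.
    # Do not return here: longer words may continue below this node.
    if END in node:
        res.append(word)
    # Go one level deeper for each child (depth first),
    # appending the child's character to the current word.
    for w, child in node.items():
        if w != END:
            self.dfs(child, word + w, res)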

The simplest solution is to use a depth-first search.
You go down the trie, matching the input letter by letter. Once there are no more letters to match, every word stored under that node is a completion you want. Recursively explore that whole subtrie, building the string as you descend.
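
To make the two steps concrete for the HashMap-based trie from the question, here is a minimal sketch. The class and method names (PrefixDfsSketch, autocomplete, collect) are made up for illustration and are not from the question's code; only the children map and the isEndOfWord flag mirror it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; a sketch, not the asker's TrieImpl.
class PrefixDfsSketch {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean isEndOfWord;
    }

    final Node root = new Node();

    void insert(String word) {
        Node current = root;
        for (char ch : word.toCharArray()) {
            current = current.children.computeIfAbsent(ch, k -> new Node());
        }
        current.isEndOfWord = true;
    }

    // Step 1: walk down the trie along the prefix.
    // Step 2: collect every word in the subtrie below that node.
    List<String> autocomplete(String prefix) {
        List<String> results = new ArrayList<>();
        Node current = root;
        for (char ch : prefix.toCharArray()) {
            current = current.children.get(ch);
            if (current == null) {
                return results; // no stored word starts with this prefix
            }
        }
        collect(current, new StringBuilder(prefix), results);
        return results;
    }

    private void collect(Node node, StringBuilder path, List<String> results) {
        if (node.isEndOfWord) {
            results.add(path.toString()); // keep going: longer words may follow
        }
        for (Map.Entry<Character, Node> entry : node.children.entrySet()) {
            path.append(entry.getKey());
            collect(entry.getValue(), path, results);
            path.deleteCharAt(path.length() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        PrefixDfsSketch trie = new PrefixDfsSketch();
        for (String w : new String[] {"go", "going", "gong", "godzilla", "shanghai"}) {
            trie.insert(w);
        }
        // The order of the results follows HashMap iteration and is not guaranteed.
        System.out.println(trie.autocomplete("go"));
    }
}
```

Using one StringBuilder and backtracking avoids creating a new String at every node; passing prefix + ch down the recursion works just as well for small tries.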

This is easier to solve recursively, in my opinion. It would go something like this:
Write a recursive function Print that prints all the words in the trie rooted at the node you pass as a parameter. Wiki tells you how to do this (look at sorting).
Walk down from the root of your trie, following the characters of your prefix, until you reach the node labeled with the prefix's last character. Call Print with this node as the parameter. Make sure you also output the prefix before each word, since Print on its own gives you the words without their prefix.
If you don't really care about efficiency, you can instead run Print from the root node and print only those words that start with the prefix you're interested in. This is easier to implement but slower.
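
As a rough illustration of the slower variant, the sketch below lists every word in the trie and then filters by prefix. All names here (FilterAllWordsSketch, allWords, slowAutocomplete) are hypothetical. It uses a TreeMap for the children, which is my own choice and not something this answer prescribes, so that a pre-order walk yields the words in alphabetical order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the "print everything, then filter" approach.
class FilterAllWordsSketch {
    static class Node {
        // TreeMap keeps child keys sorted, so traversal order is alphabetical.
        Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    static void insert(Node root, String word) {
        Node current = root;
        for (char ch : word.toCharArray()) {
            current = current.children.computeIfAbsent(ch, k -> new Node());
        }
        current.isWord = true;
    }

    // Visits every node of the trie, regardless of any prefix.
    static void allWords(Node node, String soFar, List<String> out) {
        if (node.isWord) {
            out.add(soFar);
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            allWords(e.getValue(), soFar + e.getKey(), out);
        }
    }

    // Simple but wasteful: cost grows with the whole trie, not just the matching part.
    static List<String> slowAutocomplete(Node root, String prefix) {
        List<String> all = new ArrayList<>();
        allWords(root, "", all);
        List<String> matches = new ArrayList<>();
        for (String w : all) {
            if (w.startsWith(prefix)) {
                matches.add(w);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Node root = new Node();
        for (String w : new String[] {"world", "work", "wolf", "life", "love"}) {
            insert(root, w);
        }
        System.out.println(slowAutocomplete(root, "wo")); // prints [wolf, work, world]
    }
}
```

Starting allWords at the prefix node instead of the root, with the prefix as the initial soFar, gives the faster version.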

You need to traverse the subtree starting at the node you found for the prefix.
Start in the same way, i.e. by finding the correct node. Then, instead of checking its marker, traverse the subtree below it (i.e. visit all its descendants; a DFS is a good way to do it), keeping track of the substring used to reach the "current" node from that first node.
If the current node is marked as a word, output* the prefix + the substring reached.
* or add it to a list or something.
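
The same traversal can also be written without recursion. The sketch below keeps an explicit stack of (node, substring) pairs, which is one possible way to save the substring used to reach each node. The names IterativeDfsSketch, Frame and wordsWithPrefix are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: DFS over the prefix subtree with an explicit stack.
class IterativeDfsSketch {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    // Pairs a node with the substring used to reach it from the prefix node.
    static class Frame {
        final Node node;
        final String suffix;

        Frame(Node node, String suffix) {
            this.node = node;
            this.suffix = suffix;
        }
    }

    static void insert(Node root, String word) {
        Node current = root;
        for (char ch : word.toCharArray()) {
            current = current.children.computeIfAbsent(ch, k -> new Node());
        }
        current.isWord = true;
    }

    static List<String> wordsWithPrefix(Node root, String prefix) {
        List<String> out = new ArrayList<>();
        // Find the node for the prefix, exactly as a normal search would.
        Node start = root;
        for (char ch : prefix.toCharArray()) {
            start = start.children.get(ch);
            if (start == null) {
                return out; // prefix not present
            }
        }
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(start, ""));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.node.isWord) {
                out.add(prefix + f.suffix); // prefix + substring reached
            }
            for (Map.Entry<Character, Node> e : f.node.children.entrySet()) {
                stack.push(new Frame(e.getValue(), f.suffix + e.getKey()));
            }
        }
        return out;
    }
}
```

The explicit stack mainly matters for very long words, where deep recursion could become a concern; for a typical dictionary the recursive form is usually fine.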

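I built a trie once for one of the ITA puzzles:
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
public class WordTree {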
class Node {
private final char ch;
/**
* Flag indicates that this node is the end of the string.
*/
private boolean end;
private LinkedList<Node> children;
public Node(char ch) {
this.ch = ch;
}
public void addChild(Node node) {
if (children == null) {
children = new LinkedList<Node>();
}
children.add(node);
}
public Node getNode(char ch) {
if (children == null) {
return null;
}
for (Node child : children) {
if (child.getChar() == ch) {
return child;
}
}
return null;
}
public char getChar() {
return ch;
}
public List<Node> getChildren() {
if (this.children == null) {
return Collections.emptyList();
}
return children;
}
public boolean isEnd() {
return end;
}
public void setEnd(boolean end) {
this.end = end;
}
}
Node root = new Node(' ');
public WordTree() {
}
/**
* Searches for strings that match the prefix.
*
* @param prefix - prefix
* @return - list of strings that match the prefix, or an empty list if no matches are found.
*/
public List<String> getWordsForPrefix(String prefix) {
if (prefix.length() == 0) {
return Collections.emptyList();
}
Node node = getNodeForPrefix(root, prefix);
if (node == null) {
return Collections.emptyList();
}
List<LinkedList<Character>> chars = collectChars(node);
List<String> words = new ArrayList<String>(chars.size());
for (LinkedList<Character> charList : chars) {
words.add(combine(prefix.substring(0, prefix.length() - 1), charList));
}
return words;
}
private String combine(String prefix, List<Character> charList) {
StringBuilder sb = new StringBuilder(prefix);
for (Character character : charList) {
sb.append(character);
}
return sb.toString();
}
private Node getNodeForPrefix(Node node, String prefix) {
if (prefix.length() == 0) {
return node;
}
Node next = node.getNode(prefix.charAt(0));
if (next == null) {
return null;
}
return getNodeForPrefix(next, prefix.substring(1, prefix.length()));
}
private List<LinkedList<Character>> collectChars(Node node) {
List<LinkedList<Character>> chars = new ArrayList<LinkedList<Character>>();
if (node.getChildren().size() == 0) {
chars.add(new LinkedList<Character>(Collections.singletonList(node.getChar())));
} else {
if (node.isEnd()) {
chars.add(new LinkedList<Character>(
Collections.singletonList(node.getChar())));
}
List<Node> children = node.getChildren();
for (Node child : children) {
List<LinkedList<Character>> childList = collectChars(child);
for (LinkedList<Character> characters : childList) {
characters.push(node.getChar());
chars.add(characters);
}
}
}
return chars;
}
public void addWord(String word) {
addWord(root, word);
}
private void addWord(Node parent, String word) {
if (word.trim().length() == 0) {
return;
}
Node child = parent.getNode(word.charAt(0));
if (child == null) {
child = new Node(word.charAt(0));
parent.addChild(child);
}
if (word.length() == 1) {
child.setEnd(true);
} else {
addWord(child, word.substring(1, word.length()));
}
}
public static void main(String[] args) {
WordTree tree = new WordTree();
tree.addWord("world");
tree.addWord("work");
tree.addWord("wolf");
tree.addWord("life");
tree.addWord("love");
System.out.println(tree.getWordsForPrefix("wo"));
}
}

You would need to use a List
List<String> myList = new ArrayList<String>();
if(matchingStringFound)
myList.add(stringToAdd);

After your for loop, add a call to printAllStringsInTrie(current, s);
void printAllStringsInTrie(Node t, String prefix) {
if (t.current_marker) System.out.println(prefix);
for (int i = 0; i < t.child.length; i++) {
if (t.child[i] != null) {
printAllStringsInTrie(t.child[i], prefix + (char) ('a' + i)); // cast to char: 'a' + i on its own is an int
}
}
}
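
One detail worth checking in the recursive call: in Java, `'a' + i` is an int expression, so concatenating it to a String appends a number unless it is cast back to char. A tiny sketch of the difference (the class and method names are made up for the demonstration):

```java
// Hypothetical demo of String + int versus String + char.
class CharConcatSketch {
    static String withoutCast(String prefix, int i) {
        return prefix + ('a' + i); // 'a' + i is an int, so digits are appended
    }

    static String withCast(String prefix, int i) {
        return prefix + (char) ('a' + i); // cast back to char appends a letter
    }

    public static void main(String[] args) {
        System.out.println(withoutCast("ab", 2)); // prints ab99
        System.out.println(withCast("ab", 2));    // prints abc
    }
}
```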

The recursive code below can be used when your TrieNode looks like this:
TrieNode(char c)
{
    this.con = c;
    this.isEnd = false;
    list = new ArrayList<TrieNode>();
    count = 0;
}
//--------------------------------------------------
public void Print(TrieNode root1, ArrayList<Character> path)
{
    if (root1 == null)
        return;
    if (root1.isEnd)
    {
        // print the entire path
        for (char ch : path)
        {
            System.out.print(ch);
        }
        System.out.println();
        // no return here: a longer word may continue below this node
    }
    // visit each child, extending the path on the way down and restoring it afterwards
    for (TrieNode child : root1.list)
    {
        path.add(child.con);
        Print(child, path);
        path.remove(path.size() - 1);
    }
}

A simple recursive DFS can be used to find all words for a given prefix.
Sample Trie Node:
static class TrieNode {
Map<Character, TrieNode> children = new HashMap<>();
boolean isWord = false;
}
Method to find all words for a given prefix:
static List<String> findAllWordsForPrefix(String prefix, TrieNode root) {
List<String> words = new ArrayList<>();
TrieNode current = root;
for(Character c: prefix.toCharArray()) {
TrieNode nextNode = current.children.get(c);
if(nextNode == null) return words;
current = nextNode;
}
if(!current.children.isEmpty()) {
findAllWordsForPrefixRecursively(prefix, current, words);
} else {
if(current.isWord) words.add(prefix);
}
return words;
}
static void findAllWordsForPrefixRecursively(String prefix, TrieNode node, List<String> words) {
if(node.isWord) words.add(prefix);
if(node.children.isEmpty()) {
return;
}
for(Character c: node.children.keySet()) {
findAllWordsForPrefixRecursively(prefix + c, node.children.get(c), words);
}
}
The complete code can be found here:
TrieDataStructure Example
