Thoughts on Programming

October 13, 2011

Trie data structure and implementation of “Auto complete”

Filed under: C — shadiyya @ 10:04 am
Tags: , ,

A trie is a data structure that stores the information about the contents of each node in the path from the root to the node, rather than the node itself. That means its position in the tree shows what key it is associated with. Searching a word of length m in a trie is having a time complexity of o(m) and are more space efficient when they contain a large number of short keys. Each node contains an array of pointers, one pointer for each character in the alphabet. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string.

For example, in the case of alphabetical keys, each node is a structure having two members. One is an integer for storing the value corresponding to a word if a word terminates at that node. Second is a pointer to an array of 26 characters for each of 26 alphabet characters. For example, consider the tree shown in the below figure. In this tree, each path between the root and a child represents a key and the end of each word is denoted by storing some data in the node, so this trie contains the words “to”, “tea”, “ted”, “ten”, “i”, “in”, “inn”, “A”. Inserting a new key traverses the trie until it either reaches the end of the string, or it discovers that the trie does not contain the string. In the first case, all that is necessary is to mark the end node as being the end of a key and assigning a data into this node. In the second case, the trie must be extended with new nodes that represent the remainder of the string. 

Starting from the root node, we can check if a word exists in the trie easily by following pointers corresponding to the letters in the target word.

Implementation of Auto Complete

Auto-complete functionality is used widely over the internet and mobile apps. A lot of websites and apps try to complete our input as soon as we start typing. 

The autocompletion I have implemented returns all words starting with a given prefix prefix. If a null string is entered as the input, then it will print all the words stored in the trie. Searching uses a well known algorithm called DFS (Depth First Search). It searches the depth first and then only the side links.

Here is the complete code for autocomplete using Trie: https://bitbucket.org/fathimashadiyya/trie

Advertisements

Blog at WordPress.com.

%d bloggers like this: