Gigi Labs

Please follow Gigi Labs for the latest articles.
Showing posts with label generics. Show all posts
Showing posts with label generics. Show all posts

Saturday, June 15, 2013

Indexing and Search (C#)

Hello once again, and welcome to this new article at Programmer's Ranch! :)

Today we're going to talking about indexing and search: what is it that allows you to search through thousands of documents in less than a second?

In order to understand how indexing works, we will use an example where we have the following three documents:

        doc1.txt contains:
        The Three Little Pigs

        doc2.txt contains:
        The Little Red Riding Hood

        doc3.txt contains:
        Beauty And The Beast

Now, let's say you want to find out which of these documents contain a particular word, e.g. "Little". The easiest way to do this would be to go through each word in each document, one by one, and see if the word "Little" is in that document. Conceptually, we're talking about this:


Doing this in C# is very easy:

            String[] documents = { "doc1.txt""doc2.txt""doc3.txt" };
           
            String keyword = Console.ReadLine();
           
            foreach (String document in documents)
            {
                if (File.ReadAllText(document).Contains(keyword))
                {
                    Console.WriteLine(document);
                }
            }

            Console.ReadKey(true);

Remember to put in a using System.IO; at the top, to be able to use File. If you press F5 and test this program, you'll see that it works:


However, this method isn't good because it will take longer as the documents get larger (more words) and more numerous (more documents).

The proper way to process search requests quickly is to build an index. This would look something like this:


The index stores a list of all words, each with a list of documents that contain it. If you compare it with the first diagram, you'll notice that we reversed the mapping of words and documents; this is why we call this an inverted index.

We can do this in C# by first building the index (remember to add using System.Collections.Generic; at the top):

            // Build the index
           
            String[] documents = { "doc1.txt""doc2.txt""doc3.txt" };
            Dictionary<String, List<String>> index = new Dictionary<String, List<String>>();
           
            foreach (String document in documents)
            {
                String documentStr = File.ReadAllText(document);
                String[] words = documentStr.Split();
               
                foreach (String word in words)
                {
                    if (!index.ContainsKey(word))
                        index[word] = new List<String>();
                   
                    index[word].Add(document);
                }
            }

...and then using the index to search the documents quickly and efficiently:

            // Query the index
           
            String keyword = Console.ReadLine();
           
            if (index.ContainsKey(keyword))
            {
                foreach (String document in index[keyword])
                {
                    Console.WriteLine(document);
                }
            }
            else
                Console.WriteLine("Not found!");

            Console.ReadKey(true);

In this way, there is no need to search every document for the keyword each time the user wants to search for a word. The keyword is simply located in the index (if it exists), and a list of documents that contain it is immediately available:


This was a simple proof of concept of how indexing and search works, but here are a few additional notes:

  • The index is usually built as documents are added to it, and then stored in one or more files (unlike in this program, where the index is rebuilt every time the program is run - that's just to make the illustration easier).
  • Words such as "and" and "the" which are very common are called stop words and are normally excluded from the index.
  • It is common practice to make searches case insensitive, e.g. by converting indexed words and query keywords to lowercase.

This article presented the concept of indexing and how it is used to search for a single keyword. Although there are other techniques used in text search, indexing is definitely one of the most important, and has many applications including databases and text retrieval (e.g. search engines). A fundamental concept to remember is that the whole point of indexing is to make search fast!

Thanks for reading! :) Feel free to leave feedback (positive or negative) either in the comments or by contacting me. Remember to subscribe to the Programmer's Ranch Facebook page to stay up to date with articles as they are published!

Wednesday, May 15, 2013

C# Basics: Snake Game in ASCII Art (Part 2)

Hi all! :)

In yesterday's article, C# Basics: Snake Game in ASCII Art (Part 1), we created the first part of our Snake Ranch game (a clone of Snake) and learned to use structs. Today we will learn to use lists in order to allow the snake to grow.

At the beginning of yesterday's article, we mentioned that we needed to store each part of the snake in a list, as follows:


The head of the snake would be at (1,1), for example. The remaining parts trail behind it in sequence. We can store these locations in a list, in the following fashion:


There is a List collection in .NET that allows us to do this kind of thing. As with dictionaries (see "C# Basics: Morse Code Converter Using Dictionaries", we first need to include System.Collections.Generic first:

using System.Collections.Generic;

We can now declare a new List of Location objects, and put the head as the first item:

            List<Location> snake = new List<Location>();
            snake.Add(head);

We now change the code that shows the head (from yesterday's article) to show the entire snake in the list:

                // show snake
               
                foreach (Location location in snake)
                {
                    Console.SetCursorPosition(location.X, location.Y);
                    Console.ForegroundColor = ConsoleColor.White;
                    Console.Write((char178);
                    Console.ResetColor();
                }

All we're doing here is going over each element in the list and showing a portion of the snake there. For now we only have the head though.

Before we make the snake grow, we first need to understand what happens when it moves. Consider this again:


Let's say the head of the snake is at (1,1), and it moves to the left. Then the head becomes (0,1). The tail (last part) of the snake goes away, and all the parts between the new head and the old tail remain unchanged. We need to make some changes to our code to handle this properly. First, before the while loop, we declare a new Location variable called next:

            Location next;

This will hold the position of the new head, based on the keypress received. In the example above, the head is currently at (1,1), but if the user presses the left arrow, then we want next to contain (0,1).

At the beginning of the while loop, we add the following:

                next = snake[0];
               
                Console.Clear();

The first line assigns next to the location of the current head, stored in the first element of the snake variable. Lists can be accessed using [] notation just like arrays.

Next, we change the input handling logic to modify next instead of head:

                switch(keyInfo.Key)
                {
                    case ConsoleKey.Escape:
                        return;
                    case ConsoleKey.UpArrow:
                        if (next.Y > 0)
                            next.Y--;
                        break;
                    case ConsoleKey.DownArrow:
                        if (next.Y < 24)
                            next.Y++;
                        break;
                    case ConsoleKey.LeftArrow:
                        if (next.X > 0)
                            next.X--;
                        break;
                    case ConsoleKey.RightArrow:
                        if (next.X < 79)
                            next.X++;
                        break;
                }

After that, we can do the following:

                snake.Insert(0, next);
                snake.RemoveAt(snake.Count - 1);

Here we are doing exactly what I explained earlier: adding a new head at the beginning of the snake, and removing the old tail. If you press F5 now, you should have functionality just like what we had at the end of yesterday's article (because we still have just one part of the snake):


All we have left to do now is to make the snake get longer when he eats a star. At the beginning, add a new variable to store the location of the star:

Location star = new Location(6020);

After showing the snake, add the following code to show the star:

                // show star
               
                Console.SetCursorPosition(star.X, star.Y);
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.Write('*');
                Console.ResetColor();

Finally, replace the last two lines in the while loop (Insert() and RemoveAt()) with the following logic:

                snake.Insert(0, next);
                if (next.X == star.X && next.Y == star.Y)
                {
                    Random random = new Random();
                    star.X = random.Next(080);
                    star.Y = random.Next(025);
                }
                else
                    snake.RemoveAt(snake.Count - 1);

This code is based on the code from Chase the Star (see "C#: ASCII Art Game (Part 2)"). If the snake bumps into the star, the star moves away to some other random location. Here we are using structs instead of separate starX and starY variables.

The most important thing to understand here is that if the player bumps into the star, we don't remove the snake's tail, thus allowing it to grow. We can now test it:



Awesome! :) We have just made our own variant of Snake using ASCII art. If you test it thoroughly, you'll notice there are a few problems:

  1. The snake doesn't move on its own as time goes by. You need to press an arrow key for it to move.
  2. The snake can bump into itself or the edges and nothing bad happens.
  3. If the snake moves into the edge, it sort of compresses. When you move away, it regains its former length.
  4. The star can appear over the snake.
  5. No obstacles... boring!

As you can see, it can be quite complicated to make a complete game, as there are many things you need to take care of. Making a fully blown Snake game would take many more articles, so I'll leave it at that. In this article we use generic lists to implement the Snake concept. If you're really adventurous, you might want to try and fixing some of the issues above as an exercise. Don't worry if you don't manage - just thinking about strategies to solve them will teach you how to think like a programmer.

I hope you enjoyed this, and come back for more programming articles! :)

Here is the full code for Snake Ranch:

using System;
using System.Collections.Generic;

namespace CsSnake
{
    struct Location
    {
        public int X;
        public int Y;
       
        public Location(int x, int y)
        {
            this.X = x;
            this.Y = y;
        }
    };
   
    class Program
    {
        public static void Main(string[] args)
        {
            Console.OutputEncoding = System.Text.Encoding.GetEncoding(1252);
            Console.Title = "Snake Ranch";
           
            Location head = new Location(4012);
            List<Location> snake = new List<Location>();
            snake.Add(head);
           
            Location next;

            Location star = new Location(6020);
           
            while (true)
            {
                next = snake[0];
               
                Console.Clear();
               
                // show snake
               
                foreach (Location location in snake)
                {
                    Console.SetCursorPosition(location.X, location.Y);
                    Console.ForegroundColor = ConsoleColor.White;
                    Console.Write((char178);
                    Console.ResetColor();
                }
               
                // show star
               
                Console.SetCursorPosition(star.X, star.Y);
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.Write('*');
                Console.ResetColor();
               
                // handle input
               
                ConsoleKeyInfo keyInfo = Console.ReadKey(true);
               
                switch(keyInfo.Key)
                {
                    case ConsoleKey.Escape:
                        return;
                    case ConsoleKey.UpArrow:
                        if (next.Y > 0)
                            next.Y--;
                        break;
                    case ConsoleKey.DownArrow:
                        if (next.Y < 24)
                            next.Y++;
                        break;
                    case ConsoleKey.LeftArrow:
                        if (next.X > 0)
                            next.X--;
                        break;
                    case ConsoleKey.RightArrow:
                        if (next.X < 79)
                            next.X++;
                        break;
                }
               
                snake.Insert(0, next);
                if (next.X == star.X && next.Y == star.Y)
                {
                    Random random = new Random();
                    star.X = random.Next(080);
                    star.Y = random.Next(025);
                }
                else
                    snake.RemoveAt(snake.Count - 1);
            }
        }
    }
}

Sunday, May 12, 2013

C# Basics: Morse Code Converter Using Dictionaries

Hi all! :)

Today's article is an easy one. We're going to learn to use dictionaries, and use them to make a program that converts text into the Morse Code equivalent.

Just for a change, today I'm going to use Visual Studio 2010 instead of SharpDevelop. If you want to try it, grab an express edition from Microsoft's website. Otherwise, you can manage just as well with SharpDevelop. Let's start off by creating a new console application. In Visual Studio, you can click on the "New Project..." link below the VS logo, or else from the menu, File -> New -> Project...:


Under Visual C# -> Windows, select "Console Application". As with SharpDevelop, specify a name for the project and where to put it. Note: I'm using the Ultimate edition, so your Express edition will be a little different (e.g. less project types).


The sample code given by Visual Studio is a bit different from that of SharpDevelop:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace CsMorse
{
    class Program
    {
        static void Main(string[] args)
        {
        }
    }
}


In particular, there's nothing in Main(). It doesn't matter. However, if you're using SharpDevelop, you're going to need the using System.Collections.Generic; so make sure you add it.

So, now we're going to make a little program based on the Morse Code. In case you don't know, the Morse Code is something that Chuck Norris used to communicate with the aliens who built the pyramids in Egypt. Obviously, he was the one telling them how to construct them.

As you can see from the Wikipedia link in the previous paragraph, each letter of the alphabet has a corresponding representation in Morse as a series of dots and dashes. In order to speed up transmissions, more common characters (e.g. 'E') are much shorter than others.

In C#, we can use a dictionary to map keys (e.g. 'L') to values (e.g. ".-.."). In other programming languages, dictionaries are sometimes called hash tables or maps or associative arrays. The following is an example of a dictionary mapping the first two letters of the alphabet to their Morse equivalents:

            Dictionary<char, String> morse = new Dictionary<char, String>();
            morse.Add('A', ".-");
            morse.Add('B', "-...");

            Console.WriteLine(morse['A']);
            Console.WriteLine(morse['B']);

            Console.WriteLine("Press any key...");
            Console.ReadKey(false);

First, we are declaring a dictionary. A dictionary is a generic type, so we need to tell in the <> part which data types we are storing. In this case, we have a char key and a String value. We can then add various items, supplying the key and value to the Add() method. Finally, we get values just like we would access an array: using the [] syntax. Just that dictionaries aren't restricted to using integers as keys; you can use any data type you like. Note: you'll know from the previous article, "The ASCII Table (C#)", that a character can be directly converted to an integer. Dictionaries work just as well if you use other data types, such as Strings.

Here is the output:


If you try to access a key that doesn't exist, such as morse['C'], you'll get a KeyNotFoundException. You can check whether a key exists using ContainsKey():

            if (morse.ContainsKey('C'))
                Console.WriteLine(morse['C']);

OK. Before we build our Morse converter, you should know that there are several ways of populating a dictionary. One is the Add() method we have seen above. Another is to assign values directly:

            morse['A'] = ".-";
            morse['B'] = "-...";

You can also use collection initialiser syntax to set several values at once:

            Dictionary<char, String> morse = new Dictionary<char, String>()
            {
                {'A' , ".-"},
                {'B' , "-..."}
            };

Since we only need to set the Morse mapping once, I'm going to use this method. Don't forget the semicolon at the end! Replace your current code with the following:

            Dictionary<char, String> morse = new Dictionary<char, String>()
            {
                {'A' , ".-"},
                {'B' , "-..."},
                {'C' , "-.-."},
                {'D' , "-.."},
                {'E' , "."},
                {'F' , "..-."},
                {'G' , "--."},
                {'H' , "...."},
                {'I' , ".."},
                {'J' , ".---"},
                {'K' , "-.-"},
                {'L' , ".-.."},
                {'M' , "--"},
                {'N' , "-."},
                {'O' , "---"},
                {'P' , ".--."},
                {'Q' , "--.-"},
                {'R' , ".-."},
                {'S' , "..."},
                {'T' , "-"},
                {'U' , "..-"},
                {'V' , "...-"},
                {'W' , ".--"},
                {'X' , "-..-"},
                {'Y' , "-.--"},
                {'Z' , "--.."},
                {'0' , "-----"},
                {'1' , ".----"},
                {'2' , "..---"},
                {'3' , "...--"},
                {'4' , "....-"},
                {'5' , "....."},
                {'6' , "-...."},
                {'7' , "--..."},
                {'8' , "---.."},
                {'9' , "----."},
            };

           

            Console.WriteLine("Press any key...");
            Console.ReadKey(false);


In the empty space between the dictionary and the Console.WriteLine(), we can now accept user input and convert it to Morse:

            Console.WriteLine("Write something:");
            String input = Console.ReadLine();
            input = input.ToUpper();

            for (int i = 0; i < input.Length; i++)
            {
                if (i > 0)
                    Console.Write('/');

                char c = input[i];
                if (morse.ContainsKey(c))
                    Console.Write(morse[c]);
            }

            Console.WriteLine();

Here, the user writes something and it is stored in the input variable. We then convert this to uppercase because the keys in our dictionary are uppercase. Then we loop over each character in the input String, and write its Morse equivalent if it exists. We separate different characters in the Morse output by a forward slash (/). Here's the output:


Awesome! :) In this article we used Visual Studio to create a program that converts alphanumeric text into the Morse-encoded equivalent. In the process, we learned to use dictionaries, and also revisited things like for loops and String methods.

Watch this space for more practical articles! In particular, after a few more articles, we'll be doing some network programming. Some great stuff is coming your way. :)