CSCE 624: Sketch Recognition: January 2013

Tuesday, January 29, 2013

Reading Assignment: Gestures without Libraries, Toolkits, or Training: A $1 Recognizer for User Interface Prototypes

Reference Information
Title: Gestures without Libraries, Toolkits, or Training: A $1 Recognizer for User Interface Prototypes
Authors: Jacob O. Wobbrock, Andrew D. Wilson, Yang Li
Citation: "Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes", Jacob O. Wobbrock, Andrew D. Wilson, Yang Li, Proceedings of the 20th annual ACM symposium on User interface software and technology, pp. 159-168, 2007.

Summary
This paper discussed the $1 recognizer, a gesture recognition algorithm designed to be cheap, simple, and easy to use, such that even novice programmers can include it in their own interface systems. Many gesture recognition algorithms rely on complicated math (such as that presented in the Rubine paper) or computationally expensive methods, which limits the number of programmers who can implement such a system.

When given a gesture as input, the $1 recognizer works through a series of four steps to recognize it. The first step resamples the path of the gesture to produce a path with N equally-spaced points. Step two rotates the gesture based on an angle found using a seed and search approach. Step three scales the gesture non-uniformly, then translates it to a particular reference point. Finally, step four conducts the recognition by comparing the modified gesture to a set of stored templates. Limitations of this recognizer include requiring comparison to templates, being rotation, scale, and position invariant (so it cannot distinguish gestures that differ only in orientation, aspect ratio, or location), and being unaware of time as related to the gestures.
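The four steps can be sketched in a few dozen lines of Python. This is only a rough illustration under my own assumptions, not the authors' reference code: the constants N = 64 and SIZE = 250, the function names, the search range of plus or minus 45 degrees, the 2 degree tolerance, and the guard against zero-width bounding boxes are all choices made for this sketch. The "seed" is the angle from the centroid to the first point, and the "search" is a golden section search around that seed.

```python
import math

N = 64        # points per resampled path (assumed value)
SIZE = 250.0  # side of the reference square (assumed value)
PHI = 0.5 * (math.sqrt(5) - 1)  # golden ratio constant for the angle search

def resample(pts, n=N):
    """Step 1: resample the path into n equally spaced points."""
    pts = list(pts)
    interval = sum(math.dist(a, b) for a, b in zip(pts, pts[1:])) / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        (ax, ay), (bx, by) = pts[i - 1], pts[i]
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (ax + t * (bx - ax), ay + t * (by - ay))
            out.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:       # rounding can leave the last point off
        out.append(pts[-1])
    return out[:n]

def centroid(pts):
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

def rotate_by(pts, theta):
    """Rotate all points by theta radians about their centroid."""
    cx, cy = centroid(pts)
    c, s = math.cos(theta), math.sin(theta)
    return [((x - cx) * c - (y - cy) * s + cx, (x - cx) * s + (y - cy) * c + cy)
            for x, y in pts]

def rotate_to_zero(pts):
    """Step 2 (seed): put the centroid-to-first-point angle at zero."""
    cx, cy = centroid(pts)
    return rotate_by(pts, -math.atan2(pts[0][1] - cy, pts[0][0] - cx))

def scale_and_translate(pts, size=SIZE):
    """Step 3: scale non-uniformly to a square, then move the centroid to (0, 0)."""
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w = (max(xs) - min(xs)) or 1.0  # guard for degenerate strokes (my addition)
    h = (max(ys) - min(ys)) or 1.0
    scaled = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]

def normalize(pts):
    return scale_and_translate(rotate_to_zero(resample(pts)))

def path_distance(a, b):
    """Average distance between corresponding points of two paths."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(pts, tmpl, lo=-math.pi / 4, hi=math.pi / 4,
                           tol=math.radians(2)):
    """Step 4 (search): golden section search for the best-matching rotation."""
    x1 = PHI * lo + (1 - PHI) * hi
    x2 = (1 - PHI) * lo + PHI * hi
    f1 = path_distance(rotate_by(pts, x1), tmpl)
    f2 = path_distance(rotate_by(pts, x2), tmpl)
    while abs(hi - lo) > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = PHI * lo + (1 - PHI) * hi
            f1 = path_distance(rotate_by(pts, x1), tmpl)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * lo + PHI * hi
            f2 = path_distance(rotate_by(pts, x2), tmpl)
    return min(f1, f2)

def recognize(pts, templates):
    """Return the name of the closest template (name -> normalized points)."""
    cand = normalize(pts)
    return min(templates, key=lambda k: distance_at_best_angle(cand, templates[k]))
```

To use it, each template is passed through normalize once and stored by name; recognize then normalizes the incoming stroke the same way and returns the name with the smallest average point-to-point distance.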

The $1 recognizer was tested against two other algorithms, the Rubine classifier and Dynamic Time Warping (DTW). Tests were conducted by having users provide a series of gestures at varying speeds, then using the three algorithms to recognize the gestures. It was determined that medium speed gestures are recognized more accurately than slow or fast gestures, most likely due to the balance between speed and accuracy. It was also determined that the $1 recognizer had accuracies similar to DTW and better than Rubine for the experiment that was conducted.

Thoughts
I think that it's a great idea to provide a recognition algorithm that is simple, easy-to-understand, and easy for novice programmers to include in their own work. This could serve to not only increase awareness of gesture recognition, but also to increase the number and range of ideas surrounding gesture recognition by having systems implemented by a much wider range of programmers with differing backgrounds.

This paper made for a very good read, and it was easy to understand the workings of the algorithm. It made many references to the Rubine paper that was also assigned as reading for this course, so having previously read that paper made it much easier to understand some of the motivations and the structure of the recognizer mentioned here. In addition, having been written in 2007, this paper is much more current than the others that we have read so far. While it is nice to see the foundations of sketch recognition, it was also nice to read of some more current technology and how it actually applied the foundations that we have learned.

One problem that I have with the paper is that it repeatedly mentioned a major goal of the recognizer being that it should be easily usable by novice programmers; however, it is later mentioned that the programming ease has yet to be tested. Therefore, it is unknown whether or not the algorithm actually accomplishes this major goal that was set for it. However, the evaluation that was provided was very helpful in seeing how this recognizer compares with others.

Monday, January 28, 2013

Reading Assignment: Specifying Gestures by Example

Reference Information
Title: Specifying Gestures by Example
Author: Dean Rubine
Citation: "Specifying Gestures by Example", Dean Rubine, SIGGRAPH '91 Proceedings of the 18th annual conference on Computer graphics and interactive techniques, pp. 329-337, ACM New York, NY, USA, 1991.

Summary
This paper discussed gesture-based interfaces, specifically GRANDMA, which is an object-oriented toolkit for applications with direct manipulation interfaces. It allows for gestures to be added to the interface without being hand coded. A gesture in this sense is a stroke made by a device such as a stylus or a mouse. The gesture recognition toolkit results in a recognizer that is trained from examples of gestures to be able to recognize new gestures that are input to the system.

A gesture-based application, GDP, was described, and the GRANDMA toolkit was used to provide gesture recognition for its interface. Gesture classes were used for sets of associated gestures and were arranged into a hierarchical structure. The GRANDMA toolkit works similarly to the Model/View/Controller format, associating an input handler with a view class in order to provide all of its instances and subclasses with access to it.

One limitation of GRANDMA is that only single-stroke gestures are allowed, eliminating the possibility of using more complex symbols. However, this restriction allows for faster recognition, accomplished with a two-phase interaction technique (combining both gestures and the direct manipulation property of the interface) and eager recognition (recognizing a gesture as soon as it becomes unambiguous). Multi-finger recognition was implemented by processing each finger's stroke as a separate, single stroke, then combining them to create a multi-path gesture.

Gesture recognition occurs by first calculating a set of features based on various properties of the gesture (e.g. angles, lengths, etc.), then using those features to classify the given gesture into one of a set of gesture classes. The classifier is trained using a set of example gestures with an appropriate variance.
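As a rough sketch of the idea, the following Python computes a handful of geometric features from a single stroke and scores them with a linear classifier (one weight vector per class, highest score wins). The particular six features, the function names, and the hand-picked weights in the usage note are my own illustrative assumptions; the paper uses a larger feature set and learns the weights from the training examples, which is not shown here.

```python
import math

def features(pts):
    """Return a small feature vector for one stroke given as (x, y) points."""
    (x0, y0), (xn, yn) = pts[0], pts[-1]
    x2, y2 = pts[min(2, len(pts) - 1)]       # a point shortly after the start
    d0 = math.hypot(x2 - x0, y2 - y0) or 1.0  # avoid dividing by zero
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    length = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
    turning = 0.0
    for a, b, c in zip(pts, pts[1:], pts[2:]):
        dx1, dy1 = b[0] - a[0], b[1] - a[1]
        dx2, dy2 = c[0] - b[0], c[1] - b[1]
        # signed angle between consecutive segments
        turning += math.atan2(dx1 * dy2 - dx2 * dy1, dx1 * dx2 + dy1 * dy2)
    return [
        (x2 - x0) / d0,                                      # cosine of initial angle
        (y2 - y0) / d0,                                      # sine of initial angle
        math.hypot(max(xs) - min(xs), max(ys) - min(ys)),    # bounding-box diagonal
        math.hypot(xn - x0, yn - y0),                        # start-to-end distance
        length,                                              # total path length
        turning,                                             # total signed turning angle
    ]

def classify(f, weights):
    """Pick the class with the highest linear score: bias + sum(w_i * f_i).

    weights maps a class name to (bias, [w_1, ..., w_k]).
    """
    def score(name):
        bias, w = weights[name]
        return bias + sum(wi * fi for wi, fi in zip(w, f))
    return max(weights, key=score)
```

With made-up weights such as {"line": (0.0, [0, 0, 0, 1, 0, 0]), "loop": (0.0, [0, 0, 0, 0, 0, 1])}, a straight stroke scores highest on start-to-end distance and a closed square scores highest on turning angle, which shows how different features separate different classes.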

The importance of gesture-based interfaces was emphasized multiple times throughout the paper, namely for the ability to improve interactions between humans and computers. It was hoped that this may encourage further integration of gesture recognition in interfaces.

Thoughts
This paper provided a great deal of information regarding gesture recognition. I found it to be very helpful for understanding some of the basic problems and approaches associated with such recognition systems. Published in 1991, this paper strove to encourage further interactions between humans and computers, namely with gesture recognition techniques, including multi-finger touch recognition. This is a topic that is still being emphasized today, although improvements and wider usage have occurred. The system presented, GRANDMA, is an object-oriented system that can apply a hierarchical structure to classes of gestures. The object-oriented nature with the hierarchical structure reminded me of the Sketchpad paper read previously, with its usage of hierarchical structures to easily organize the system and to provide simple extensibility.

An important point of this paper is that it presented a simple, fast gesture recognition algorithm. The extensive use of features for classifying various attributes of a particular gesture distinguished the different properties of a stroke that can be used to compare the differences of various classes of gestures. The use of a classifier for recognizing gestures was simple and easy-to-understand, despite the mathematics associated with calculating the features. The simplicity of this algorithm, combined with its extensibility, provided a foundation for further gesture recognition systems to build upon.

Tuesday, January 22, 2013

Reading Assignment: Sketchpad: A Man-Machine Graphical Communication System

Reference Information
Title: Sketchpad: A Man-Machine Graphical Communication System
Author: Ivan E. Sutherland
Citation: "Sketchpad: A Man-Machine Graphical Communication System", Ivan E. Sutherland, DAC '64 Proceedings of the SHARE design automation workshop, pp. 6.329-6.346, ACM New York, NY, USA, 1964.

Summary
This paper discussed Sketchpad, a system designed for allowing users to create line drawings on a computer. The interaction was accomplished by using a light pen for indicating points on the screen and a set of push buttons for accomplishing various commands. The image could be created, manipulated (such as zooming and rotating), and observed on a display screen. An image can consist of any number of subpictures consisting of various symbols. Instances of drawings can be copied. In addition, constraints can be applied to parts of the drawing in order to apply mathematical and geometrical conditions to the image. The structure of the pictures is stored in memory using a ring structure for maintaining the information about a drawing. A generic, hierarchical structure was used for implementing Sketchpad, in which generic functions exist that call more specific subroutines, such that specific operations can be easily added to the system and be called by the generic functions.

The system is designed to aid users with designing and drawing images. It is useful for storing and modifying drawings, increasing understanding of complicated designs, and creating repetitive drawings. In particular, it is useful within fields that could benefit from an easy way to understand and duplicate images, such as engineering.

Critique
Sketchpad seems to be a very important development in computer science, particularly in the areas of sketch recognition and human-computer interaction. Reading about it now, the use of push buttons seems rather outdated and cumbersome compared to the touch screens that we currently have, but the ability to use a computer to create drawings had to start somewhere. It introduced a novel way of interacting with computers, using a light pen and push buttons to create images on a screen. This is a form of interaction that is often taken for granted in current times, where pens and fingers can be used with touch screens to draw out images with a computer. It is amazing to think that Sketchpad provided these abilities to draw and manipulate images on a machine back in 1964. This paper really shows how far we have come in that time, and yet the similarities that remain despite the passing of time show the importance of the ideas discussed with the Sketchpad system. It mentioned ideas that are now fully fledged systems that are used on a day-to-day basis in many different fields, such as software for aiding both artistic drawings (such as animations) and engineering designs (such as bridge design, circuit design, etc.). Reading this paper makes me interested in learning just how the ideas mentioned in this paper affected the future of computer science and how we have gotten from a system like Sketchpad to the systems that we have today.

I also found it interesting that a generic structure was emphasized for the implementation of Sketchpad, such that a hierarchical structure exists going from general to more specific functions. This was done to provide the ability to easily extend the system. While these ideas have been around for a long time, they are continuously being highlighted in many programming methods today, specifically with object-oriented programming approaches.

Wednesday, January 16, 2013

Introduction

E-mail address: shoffmann@neo.tamu.edu

Class standing: 2nd year Master's student

Why are you taking this class?
Sketch recognition sounds like an interesting topic, so I would like to learn more about it. Also, I have had classes with Dr. Hammond in the past and really enjoyed them, so I thought it'd be nice to take another class taught by her.

What experience do you bring to this class?
I have experience with multiple programming languages, especially C++ and Java, from working on various types of projects (including design, AI, game development, etc.)

What are your professional life goals?
I would like to work in the game development industry. Within that field, I simply want to program, preferably working in areas that I enjoy such as dealing with artificial intelligence or human-computer interaction. In addition, I would like to be able to inspire and help others in the field of computer science.

What are your personal life goals?
My personal goals are to achieve my professional goals and to make decisions that both make me happy and that I can be proud of.

What do you want to do after you graduate?
After graduation, I will be working as a software developer for a large software company.

What do you expect to be doing in 10 years?
I expect to still be programming in 10 years. I would like to be developing video games, hopefully doing something that allows me to apply both my knowledge of AI and physics within that field.

What do you think will be the next biggest technological advancement in computer science?
I think that there will be advancements in how we actually interact with devices. Recently, we have had smart phones, tablets, and all other sorts of mobile devices become the "next big thing". While a new type of device is entirely possible as the next advancement, I think that we're more in need of improvements on how we actually interact with and perceive interacting with the devices that we currently use.

If you could travel back in time, who would you like to meet and why?
There are so many great people that have influenced the world to be how it is today, so I don't think that I could pick just one person to meet. It would have to be someone with great, novel ideas that maybe was not entirely understood at the time so that I could discuss with them how and why they came up with those ideas. But if I could travel back in time, why can't I travel forwards, as well? I think that would be much more entertaining, to see how things have changed and improved in a future that I would not normally be able to see within my lifetime.

Describe your favorite shoes and why they are your favorite.
My favorite shoes are my purple Converse, because they are purple. And cool.

If you could be fluent in any foreign language that you're not already fluent in, which one would it be and why?
I would like to be fluent in German, because I find it to be a very interesting language that sounds really cool.

Give some interesting fact/story about yourself.
I have a physics minor that I'm still trying to find a good use for.