This is where I brutalize the masses.
Computer Science
Data Parallel Acceleration of Decision Support Queries Using Cell/BE and GPUs
Mar 8th
Firstly, a small introduction on the processing units. We all know what a GPU (Graphics Processing Unit) is. It is the processing unit with purpose to render 3D graphics, most commonly found on video cards. What might be a new definition is the Cell/BE processing unit. Cell/BE is a processor developed by STI (Sony, Toshiba, IBM) and it is the processor which is used inside the Playstation 3 console.
Back in the good old days when people designed CPUs two CPUs with identical architecture could run at different speeds by changing their clock frequency. As computing needs increased we kept increasing the CPU frequency, until we reached a dead end, where increasing the speed further resulted in such high temperatures and such low increase in throughput that we had to find other ways, and that gave birth to multi-cores.
Homogeneous multi-core processors such as Intel or AMD Quad Cores, Core Duo, I3/I5, etc provide a small number of identical cores that share cache and are generally easy to program using POSIX Threads or OpenMP directives. GPUs on the other hand include a large number of very basic cores. GPUs are highly multithreading, to help fast rendering of 3D graphics. However, recently people started exploiting this feature of GPUs to have them process applications that have nothing to do with graphics. For example, NVIDIA GPUs can be easily programmed with CUDA. Lastly, the wonderful Cell/BE, a heterogeneous multi-core processing unit, with one general purpose PowerPC Processor Element (PPE) and eight special purpose Synergistic Processing Elements (SPEs). Cell has some real potential, which is a big reason on why people say that the PS3 is by far more technologically advanced than the XBOX, but it is also tough to program, which is the reason why much fewer games were released for the PS3, especially in the early days of its deployment. In fact, the world’s second most powerful supercomputer at this time of writing, the IBM RoadRunner, has a peak performance of 1.7 petaflops and uses Cell processors.
Onwards to experiments. To make things easier for programmers, a wonderful company called RapidMind (now acquired by Intel) developed a software product that allowed developers to target any processing unit they like. Software written in a parallel programming fashion could be compiled with RapidMind for any target processing unit to take full advantage of it.
Now consider a database management system where we wish to perform a simple nested loop on our data to join a foreign key with our table primary key to retrieve attributes. That’s an O(n^2) algorithm right there, which for large tables would require not very pleasing time, if you are looking at an interactive application such as a web-site. However, with this algorithm, we can run all the executions of the inner loop which is O(n) in parallel, saving us a lot of time. On top of that, if we hash our keys then the inner loop becomes an O(log n).
Of course, there are details to consider. Cell/BE appears to fail to provide a significant increase in speed but that’s until you realise that due to its unique architecture. The Cell/BE processors requires tuning. In fact there is an entire chapter in the RapidMind documentation on Cell/BE. Follow that and you can see how the processor can perform some real wonders. Further more if you realise that GPUs were designed for gaming so they work on packages of 4 floats (3 for coordinates and one for depth relevant to the camera) then you’ll find that you can gain significant speed up if you use your data as such.
To summarize things up, general-purpose multi-core CPUs have small scalability, but are simple to program and give overall best performance. For specialized tasks however which use intense data and processing algorithms and can run in parallel, processing units with more cores would be much more efficient. For example web servers that should handle a lot of client requests in parallel could be much more efficient to run on GPUs (experiment needed).
Size complexity of two-way finite automata
Mar 4th
Finite automata have been so far investigated much by researchers but only when it comes to computability. Complexity questions have only been addressed sporadically in the past. It has been found that for a given class of problems solvable by a Turing Machine that can not write (i.e. a finite automaton) a non-deterministic finite automaton would require O(h) number of states, whereas a deterministic finite automaton would require O(2^(2h^2)), where h is the size of the input. What this means is that a non-deterministic automaton grows in space linearly, while a deterministic one grows exponentially, in such way that for h=16 it would require space equivalent to all the matter in the universe. Non-determinism clearly wins.
However, can it be proved that there is no way for deterministic finite automata to solve such a general problem in less space? The answer is maybe, because neither that it can nor that it can’t has been proven yet. In fact, the generalisation of the problem would be the famous P = NP problem. Finding an algorithm that can deterministically solve a general problem even in polynomial time instead of exponential can change our everyday lives. It will make industry problems easily solvable by computers, it will render cryptography useless, etc. Afterall, it is no joke that the Clay Mathematics Institute of Cambridge, Massachusetts (CMI) is offering one million dollars to whoever solves P = NP.
So now that we have given a short definition of the problem and the motivation behind, let’s talk a little bit in more depth. The question here is:
“Can 2DFAs always stay at most polynomially larger than 2NFAs?”
This is one of the most interesting and hard questions when it comes to researching size complexity of two-way finite automata. In 1978, Sakoda and Sisper proposed a theory that starts with the class 2D of all families of regular problems that can be solved by 2DFAs of polynomially growing size, and a similar class 2N defined for 2NFAs.
In order to elaborate some more on this, consider a problem such as: A Turing Machine gets input where each cells receives a two-column graph with the same constant number of nodes per column. Considering the multi-column graph produced by merging the graphs of adjacent cells, the TM has to compute whether there exists a path from a node in the leftmost column to a node in the rightmost column. It is easy to prove that this problem can be easily solved by a 2NFA in polynomial time, and therefore lies in the class 2N as defined earlier. The question is, can it be also proved for 2DFAs such that 2N = 2D? Sakoda and Sisper went on to introduce the so called homomorphic reductions between problem families, proved that 2D is closed under them and identified a particular family in 2N that is complete with respect to them, the so called “two-way liveness”. However, as complex as this may make the question above seem, it is still tangible, and solving it brings us one step closer to solving the P = NP problem.
A step into usability
Mar 2nd
Usability is a very important research topic in the field of Computer Science. A functional system is not worth much unless it is also usable. According to Shackel (1991):
“The usability of a system is the capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and user support, to fulfil the specified range of tasks, within the specified range of scenarios.”
Usability engineering (also known as Human Computer Interaction) is “a discipline concerned with the design, evaluation, and implementation of interactive computing systems for human use and the study of major phenomena surrounding them.”
So how exactly can we define and measure usability? The term “usability” replaced the term “user-friendly” which is a very vague term used in the past. We can think of usability in two ways:
- Usability is a relative property of the system since it depends on the perception capabilities of its users. Evaluation of usability therefore can be subjective
- Usability is a set of objective measures of interaction.
Shackel went on and broke down usability into four categories:
- Effectiveness: performance in accomplishment of tasks
- Learnability: degree of learning to accomplish tasks
- Flexibility: adaption to variation in tasks
- Attitude: user satisfaction with the system
Similarly, two years later, Nielsen (1993) broke down usability into the following categories:
- Learnability: How easy it is to learn to use the system.
- Efficiency: The system allows for high productivity once the user learns how to use it.
- Memorability: The system should be easy to remember so that returning users will not have to go through the learning phase again.
- Errors: The system should have a low error rate when it comes to users perceiving how they should accomplish a task. The system should also allow for recovery from these errors.
- Satisfaction: The system should be pleasant to use.
In order to measure usability we need to establish some usability evaluation methods (UEMs). According to Zhang (2001) there are three broad types of UEMs:
- Testing
- Inspection
- Inquiry
Each of these types consists of different methods, but each type has its own characteristics for all its methods. Below is a brief explanation of these types:
- Usability testing is the method of selecting a few representative users to work on typical tasks and therefore help the evaluators get an idea of how usable the system will be in practice. Such methods include the coaching method, co-discovery learning, performance measurement, question-asking protocol, remote testing and the thinking aloud protocol.
- Usability inspection requires usability specialists and software developers to examine and judge whether the usability features of a system follow professional and established usability principles. Such methods include heuristic evaluation, cognitive walkthrough, feature inspection, pluralistic walkthrough and standards inspection.
- Lastly, usability inquiry requires usability evaluators to obtain information about users likes, dislikes, needs and understanding of the system by communicating with them. Such methods include field observation, interviews, focus groups, surveys and logging actual use.
Usability needs to be measured in order for the software developers to be able to define the need for improvement and its type. Current practices for measuring usability can be broken down into measures of effectiveness, efficiency and satisfaction.
Examples of measures of effectiveness:
- Binary task completion (number of tasks users succeeded, failed within a certain time, or gave up)
- Accuracy measures (quantify the number of errors users make while in the process of completing a task)
- Recall measures (how much information users can recall after the use of an interface)
Examples of measures of efficiency:
- Time (how long the users need to complete tasks)
- Input rate (e.g. words per minute)
- Mental effort (mental resources spent, e.g. heart rate variability)
- Usage patterns (number of times a certain action has been performed, how much information users access, deviation from optimal solution, etc)
Examples of measures of satisfaction:
- Preference measures (capture which interfaces users prefer using)
- Specific attitudes towards the interfaces (liking, fun, annoyance, etc)
- Perception of outcomes (users’ assessment of their performance, users’ perception of learning, users’ confidence in the solution to tasks)