Final

Here is the final report for this project:

http://massagan.com/Reportformat.pdf

HW7: Project Progress Review Rebuttals

Report Abstract:

Face recognition has been widely used in authentication, security and
surveillance. However, although many face recognition systems have been
implemented in software and hardware, they usually lack real-time
response or are not implemented on small-scale embedded systems. In
this report, the author tries to implement the eigenface algorithm on
an FPGA to achieve a small footprint and real-time response.
The report describes the approach in the following steps: 1. OpenCV
based, 2. C coded, 3. Catapult C and FPGA prototype, 4. hand-coded HDL
and FPGA prototype. The overall system contains a face detection module
and a face recognition module, and the face recognition module can be
divided into a learning submodule and a recognition submodule. This
implementation focuses on the recognition submodule. The system gets
face image data and eigenvectors from the face detection module and the
learning submodule, respectively.

Significance
This report is somewhat significant. When the work is done, we will
know the performance difference between each kind of face recognition
implementation. Although the work is still ongoing, the author should
provide the experimental results for the OpenCV and C code. The results
could include how much runtime is spent on face detection, learning
and recognition. When using the FPGA to accelerate the computation,
how much gain can be expected? What is the benefit of choosing the
eigenface algorithm? Why are you not interested in implementing the
learning part on the FPGA?

Related Research
This is a prototype report. The author references OpenCV, previous
work and algorithms, and implements them. In this report, I cannot see
the novelty, such as how the author modified the code to accelerate it
or to fit the FPGA resources.

Theoretical and experimental evaluation and Technical detail
The report does not provide theoretical and experimental evaluation or
technical detail. Possibly because it is a progress report with a page
limit, it is hard to describe the details in the report. It could also
state what difficulties were encountered during the progress.

Expression
Since it presents the big picture and the approach clearly, this part
lets readers without a specific background understand the overall
design easily. However, the lack of theoretical details and
experimental results makes it a general article. For example, even
though the face recognition submodule is the key work in the design,
the author just lists the function names without any description. The
report should roughly describe how each function works, what its inputs
and outputs are, and how it relates to the other functions.

Figures and tables
Clear figures and tables are provided.

References
References are provided.

Summary
It provides a brief big picture, flow and project goal, but lacks
details. I recommend providing more details and description, and trying
to come up with some novelty that differs from the previous report.
The future work is feasible; the report just has to state the
difficulties encountered during the progress.

Questions
1.        Why are you only interested in the face recognition submodule and
not the learning submodule?

Answer: The role of training is to generate training data, and the generated training data is used in the recognition submodule. As I stated in my paper, in real-time systems or in hardware such as an FPGA, we don’t need to implement the learning part in hardware, because what we actually need is the data generated by training, not the training submodule itself.
2.        Why did you choose the eigenface algorithm instead of another algorithm?

Answer: It is the simplest and most widely used algorithm. Above all, many face recognition algorithms are based on eigenface.
3.        Can you give a brief description of the functions in your
design?

Answer:


In the picture above, each stage corresponds to a particular function.

4.        How do you accelerate your design? Purely with the FPGA, or
with other techniques?

Answer: The first step is to run it on the FPGA; the second is to apply optimizations such as loop unrolling and pipelining.

5.        What percentage of the time does face recognition
take? After you accelerate it, what performance can be expected?

Answer: My preliminary results show a 20x speedup, and I expect this to improve.
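
As a rough illustration of why the recognition side only needs the data produced by training, here is a minimal C++ sketch of the recognition step in the eigenface method: project the mean-subtracted input face onto the eigenvectors, then pick the closest stored weight vector. The names Vec, project, nearest and gallery are made up for this sketch, and the squared Euclidean distance is an assumption; it is not the implementation from the report.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;  // hypothetical alias for a flattened image or weight vector

// Project a face onto each eigenvector after subtracting the mean face.
// `mean` and `eigenfaces` are the data produced offline by the learning step.
Vec project(const Vec& face, const Vec& mean, const std::vector<Vec>& eigenfaces) {
    Vec weights(eigenfaces.size(), 0.0f);
    for (std::size_t k = 0; k < eigenfaces.size(); ++k) {
        for (std::size_t i = 0; i < face.size(); ++i) {
            weights[k] += eigenfaces[k][i] * (face[i] - mean[i]);
        }
    }
    return weights;
}

// Return the index of the stored weight vector closest to `weights`
// (squared Euclidean distance, an assumption), or -1 if the gallery is empty.
int nearest(const Vec& weights, const std::vector<Vec>& gallery) {
    int best = -1;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t g = 0; g < gallery.size(); ++g) {
        float d = 0.0f;
        for (std::size_t k = 0; k < weights.size(); ++k) {
            const float diff = weights[k] - gallery[g][k];
            d += diff * diff;
        }
        if (d < bestDist) {
            bestDist = d;
            best = static_cast<int>(g);
        }
    }
    return best;
}
```

Both functions are plain nested loops of multiply-accumulate operations over fixed-size data, which is the kind of code that lends itself to the loop unrolling and pipelining mentioned in the answer to question 4.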

Hello World on Catapult

This is just an example for beginners with the Catapult C Synthesis tool.

I was trying to see how to implement a simple example on an FPGA using Catapult. I found this post, and I think it is pretty useful for those who have little experience with hardware.

First step: Set the working directory in Catapult, because if you don’t set it, Catapult usually uses a directory such as temp and your files might get lost.

Second step: Add the source file using “Add File”. For this I made a Microsoft Visual C++ project with one cpp file. My cpp file is named blink.cpp and is shown below.
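// Marks clk_div() as the top-level function for synthesis.
#pragma hls_design top
#define MAX_COUNT 50000000
bool clk_div()
{
    // State kept between calls.
    static bool toggle_val = false;

    // Count half a period, then flip the output on the last iteration.
    for (int i = 0; i < MAX_COUNT/2; i++)
    {
        if (i == MAX_COUNT/2 - 1)
            toggle_val = !toggle_val;
    }
    return toggle_val;
}

// Software testbench: call the divider nine times and print each value.
#include <stdio.h>
int main()
{
    printf("Starting clock divider\n");
    for (int i = 0; i < 9; i++)
    {
        printf("Clock value = %i\n", clk_div());
    }
    return 0;
}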

Third: Set up the design (Setup Design). In this step, you can specify your hardware (in my case I am targeting a Virtex-4 FPGA), the Design Compiler, etc.

Fourth: Set some architectural constraints. If you don’t understand them in depth, you can basically just click through Architectural Constraints. In this part, you can unroll loops or pipeline them.

Fifth: Schedule the design. As I understand it, in this part you can schedule your code onto the hardware.

Sixth: Generate RTL. In this step, Catapult generates several RTL files, which you then need to load onto the FPGA.

Finally, you can verify your RTL against your C++ code. To do so, go to the Flow Manager window, click on SCVerify and enable it.

This adds a Verification folder to the Project Files window. Go to the Verification folder and click on MS Visual C++ 9.0. There will be a file named Original Design + Testbench. If you double-click this file, you get your result.

In our case, we should get a result like the one below.

# Starting clock divider
# Clock value = 1
# Clock value = 0
# Clock value = 1
# Clock value = 0
# Clock value = 1
# Clock value = 0
# Clock value = 1
# Clock value = 0
# Clock value = 1
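
The fourth step mentions unrolling loops. To make that concrete, here is a small plain C++ sketch of what a 4-way unroll does to a dot-product loop when written out by hand. It only illustrates the transformation; it is not how Catapult applies unrolling (the tool does this through its constraints), and the function names dot_rolled and dot_unrolled4 are made up for this example.

```cpp
#include <cstddef>

// Rolled form: one multiply-accumulate per loop iteration.
int dot_rolled(const int* a, const int* b, std::size_t n) {
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Unrolled by 4: four independent multiply-accumulates per iteration,
// which hardware could in principle compute in parallel.
int dot_unrolled4(const int* a, const int* b, std::size_t n) {
    int acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }
    int acc = acc0 + acc1 + acc2 + acc3;
    // Handle the leftover elements when n is not a multiple of 4.
    for (; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

Both functions return the same value for the same inputs; the unrolled one trades more arithmetic units for fewer loop iterations.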

Homework 4 (Progress Report)

Progress report download here

Face recognition on embedded systems

In this post, I will summarize some of the most relevant work on implementing face recognition on embedded systems.

Related work

Sajid et al. proposed a design for a high-performance FPGA-based face recognition system based on a fixed-point technique with a software-hardware co-design (SHcoD) methodology. They stated that eigenvalue computation is sensitive to floating-point precision, and that it could be computed using a fixed-point technique. Their system uses power more efficiently than a floating-point architecture. However, they didn’t provide exact measures for the PCA algorithm, such as the number of images or the latency [5].
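
For readers who have not met fixed-point arithmetic, here is a minimal C++ sketch of the general idea: a real number is stored as an integer scaled by a power of two, and a multiply is an integer multiply followed by a shift. The Q16.16 format (16 integer bits, 16 fractional bits) and the names fix16, to_fix, to_double and fix_mul are assumptions for illustration; the cited paper's actual number format is not given here.

```cpp
#include <cstdint>

using fix16 = std::int32_t;      // hypothetical Q16.16 value: real number * 2^16
constexpr int kFracBits = 16;    // number of fractional bits (assumed)

// Convert a real number to Q16.16 (truncates toward zero).
constexpr fix16 to_fix(double x) {
    return static_cast<fix16>(x * (1 << kFracBits));
}

// Convert a Q16.16 value back to a real number.
constexpr double to_double(fix16 x) {
    return static_cast<double>(x) / (1 << kFracBits);
}

// Multiply two Q16.16 values: widen to 64 bits so the intermediate product
// does not overflow, then shift right to drop the extra fractional bits.
// Note: right-shifting a negative value is arithmetic on common compilers,
// but was implementation-defined before C++20.
constexpr fix16 fix_mul(fix16 a, fix16 b) {
    return static_cast<fix16>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}
```

For example, fix_mul(to_fix(1.5), to_fix(2.0)) yields the same integer as to_fix(3.0). On an FPGA this kind of arithmetic needs only integer multipliers and shifters instead of a floating-point unit.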

Lee developed a LEGO NXT-based face recognition system. The main contribution is the design of a mobile robot that collects face data for learning purposes under different lighting conditions [6].

Wei and Bigdeli suggested an implementation of an efficient automatic face recognition system aimed at embedded handheld devices, which usually have low memory and power. They used an ADI Blackfin processor, which makes the implementation suitable for applications needing real-time user authentication [7].

Gottumukkal and Asari devised a face recognition algorithm named composite PCA, which is more suitable for real-time face recognition on embedded systems. Composite PCA has more parallelism than conventional PCA, and this parallelism can be exploited to design a real-time face recognition system. They implemented the algorithm on an FPGA board and achieved a latency of 4 ms to recognize a face from 110 images of 10 individuals [8].

Zuo et al. developed a real-time face recognition system for smart home applications. It has face detection, model-based facial feature extraction, face normalization and face recognition modules. They achieved a speed of 4 frames/sec and a 95% recognition rate [9].

Unlike the works above, we will implement a full system including face detection and face recognition on an FPGA. Face detection on an FPGA has been done by Cho et al. [1]. We will implement face recognition in four ways: 1) in pure software, 2) with hand-coded FPGA logic, 3) with the Catapult C tool, 4) on a GPU. After completing these four implementations, we can draw some conclusions based on them.

References

[1] Junguk Cho, Bridget Benson, and Ryan Kastner, “Hardware Acceleration of Multi-view Face Detection,” IEEE Symposium on Application Specific Processors (SASP 2009), San Francisco, California, USA, July 27-28, 2009.

[2] Junguk Cho, Bridget Benson, Sunsern Cheamanunkul, and Ryan Kastner, “Increased Performance of FPGA-Based Color Classification System,” ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2010), to appear, 2009.

[3] Junguk Cho, Shahnam Mirzaei, and Ryan Kastner, “Real-Time Vision Processing,” the UCSD Jacobs School of Engineering Research Expo, University of California San Diego, La Jolla, California, USA, February 19, 2009.

[4] M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, Vol. 3, No. 1, 1991, pp. 71-86

[5] I. Sajid, M. M. Ahmed, I. Taj, M. Humayun, and F. Hameed, Design of High Performance FPGA Based Face Recognition System, PIERS Proceedings, Cambridge, USA, July 2-6, 2008

[6] Tae-Hoon Lee, Real-Time Face Detection and Recognition on LEGO Mindstorms NXT Robot, LNCS, Advances in Biometrics, Volume 4642/2009, Pages: 1006-1015

[7] Mao Wei and Abbas Bigdeli, Implementation of a Real-Time Automated Face Recognition System for Portable Devices, International Symposium on Communications and Information Technologies 2004 (ISCIT 2004), Sapporo, Japan, October 26-29, 2004

[8] Rajkiran Gottumukkal, Vijayan K. Asari, System Level Design of Real Time Face Recognition Architecture Based on Composite PCA, Proceedings of the 13th ACM Great Lakes Symposium on VLSI, Washington, D.C., USA, 2003, Pages: 157-160

[9] Fei Zuo, Peter H. Real-time Face Recognition for Smart Home Applications, International Conference on Consumer Electronics (ICCE2005), vol. 51 p. 183-190, February 2005, Las Vegas, U.S.A.

Bottom-up approach for real-time face recognition system implementation

One of the major limitations of today’s embedded systems is their capacity to process large amounts of data in real time. This limitation is usually evident when embedded systems are required to do image processing such as object detection or recognition. Because of limited memory capacity and low CPU speed, it has been a burden for embedded systems to process image data in real time. One solution for processing image data and delivering the desired result in at least soft real time is to implement image processing algorithms on an FPGA (Field-Programmable Gate Array) or a GPU (Graphics Processing Unit). Cho et al. [1],[2],[3] did a substantial amount of work on implementing various computer vision algorithms on an FPGA board, and reported promising preliminary results for human face detection, real-time vision processing and visual tracking.

Another major limitation of embedded systems is their hardware complexity from the viewpoint of the embedded system’s programmer (designer). Many programmers know C/C++, but only a few of them know Verilog or VHDL, which is required to program embedded system hardware. One solution to this issue is to use the Catapult C tool from Mentor Graphics. In this project, we propose our approach to the two problems mentioned above. Specifically, we will implement a face recognition [5] algorithm using the Catapult C tool, and possibly also implement it on a GPU. The detailed project schedule is below.

Schedule (January to March):

  • Catapult/GPU learning: learning process
  • Catapult synthesis: start using Catapult, then finish with Catapult
  • GPU implementation: GPU work, finishing in early March
  • Final presentation: tentatively March 18

References

[1] Junguk Cho, Bridget Benson, and Ryan Kastner, “Hardware Acceleration of Multi-view Face Detection,” IEEE Symposium on Application Specific Processors (SASP 2009), San Francisco, California, USA, July 27-28, 2009.

[2] Junguk Cho, Bridget Benson, Sunsern Cheamanunkul, and Ryan Kastner, “Increased Performance of FPGA-Based Color Classification System,” ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2010), to appear, 2009.

[3] Junguk Cho, Shahnam Mirzaei, and Ryan Kastner, “Real-Time Vision Processing,” the UCSD Jacobs School of Engineering Research Expo, University of California San Diego, La Jolla, California, USA, February 19, 2009.

[4] M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, Vol. 3, No. 1, 1991, pp. 71-86

Real time Computer vision

In some sense, computer vision is the reverse of computer graphics. While computer graphics concerns how to turn different image signals into a graphical form that can be seen by humans or read by machines, computer vision is more about finding or extracting information from such images.

Computer vision is a newly growing area of computer science and engineering.

Examples of applications of computer vision include systems for:

  • Detecting events (surveillance)
  • Detecting objects (face recognition)
  • Robotics, especially vision-based navigation systems
  • HCI
  • Modeling objects or environments
  • etc.

However, while computer vision is still in its infancy, many interesting problems are emerging. One of them is improving the performance of computer vision algorithms through hardware implementation.

Hardware implementation of existing computer vision algorithms is important because it provides real-time capability.

One such example, done by Dr. Jung Uk Cho, is implementing vision algorithms on an FPGA.

On this homepage, I hope to post some of the work I will do in the future on implementing computer vision algorithms, their applications, and their implementation on hardware (FPGA, GPU).


