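Project kickoff: this project combines action recognition, object detection, object recognition, and text sentiment analysis. Since I am responsible for the action recognition and object detection module, this document only covers my own work.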
Vision
Through image, video recognition, text recognition, the
Environment Support
- Keras 2.0/2.2
- Tensorflow 1.2
- PyTorch
Main References
I got the set of papers from here.
VIDEO TO TEXT
IMAGE CAPTION
GET THE NAME FROM DETAIL F-CNN
CVPR
Two-stream
My action recognition module is based on the two-stream approach from the paper "Two-Stream Convolutional Networks for Action Recognition in Videos".
Reasons for using Two-stream
In recent years, action recognition has largely developed along the two-stream line, and researchers have published many papers on it in IEEE and SCI-indexed venues, among others. The main reason, however, is that I worked on video captioning last year. Both tasks analyze video information, so I want to try a different algorithm and finish this project at a higher quality (and get a higher score).
Procedure
Graphviz
- The diagram was generated with dot.
The basis of the two-stream approach is the fusion of spatiotemporal information in a dual-stream network. Put differently, the key point is a better fusion of spatial and temporal features. This can be pursued through:
- Interaction between layers within a single network, such as ResNet or Inception.
- Interaction between the two networks of a dual-stream model, including the exploration of different fusion methods. Here it is worth considering a ResNet-style structure to connect the two streams.
This project uses the second method.
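To make "fusion between the two streams" concrete, here is a minimal NumPy sketch of two common options: late fusion, which averages the class probabilities of the two streams, and feature-level fusion, which concatenates their features before a shared classifier. This is an illustration under assumptions, not this project's implementation; the helper names, the weight `w_spatial`, and the toy logits are all made up for the example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, subtracting the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def late_fusion(spatial_logits, temporal_logits, w_spatial=0.5):
    """Weighted average of the two streams' class probabilities (hypothetical helper)."""
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    return w_spatial * p_spatial + (1.0 - w_spatial) * p_temporal


def concat_fusion(spatial_features, temporal_features):
    """Feature-level fusion: concatenate along the feature axis (hypothetical helper)."""
    return np.concatenate([spatial_features, temporal_features], axis=1)


# Toy example: 2 clips, 3 classes. The values are made up for illustration.
spatial_logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]])
temporal_logits = np.array([[1.0, 1.2, 0.1], [0.1, 2.0, 0.4]])

fused = late_fusion(spatial_logits, temporal_logits, w_spatial=0.5)
predictions = fused.argmax(axis=1)  # predicted class index per clip
```

Late fusion keeps the two networks fully independent until the final scores, which makes it the simplest option to try first. Concatenation lets a classifier learn how the streams interact, at the cost of training that extra layer.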
Spatial network
It mainly captures important object features in video frames.
Temporal network
Both networks are fine-tuned from ImageNet-pretrained weights.
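Fine-tuning from ImageNet might look roughly like the Keras sketch below: load a pretrained backbone without its original classifier, then attach a new head for the 101 UCF-101 classes. The text does not say which backbone is used, so ResNet50, the helper name `build_finetune_model`, the optimizer, and the choice to freeze the backbone are assumptions. The sketch uses a 3-channel input, as for RGB frames; how the temporal network's input is matched to the pretrained first layer is not specified in the text and is left out here.

```python
from keras import layers, models
from keras.applications import ResNet50


def build_finetune_model(num_classes=101, weights="imagenet",
                         input_shape=(224, 224, 3), freeze_base=True):
    """Attach a new classification head to a pretrained backbone (hypothetical helper).

    weights="imagenet" downloads pretrained weights on first use;
    weights=None builds the same architecture with random initialization.
    """
    base = ResNet50(weights=weights, include_top=False, input_shape=input_shape)
    if freeze_base:
        # Keep the pretrained convolutional filters fixed; train only the new head.
        for layer in base.layers:
            layer.trainable = False
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="sgd", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_finetune_model()` with the defaults gives a network ready to train on labeled frames. A common follow-up is to unfreeze some of the backbone layers and continue training with a small learning rate.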
Document-Quality Attributes
Each contains:
Part | Availability | Performance | Modifiability | Usability | Security | Testability |
---|---|---|---|---|---|---|
Scenario | Cannot identify bad behavior | Exported files are displayed correctly | New demands and structural optimization | Customers want to export statistics files easily and need reliable data | Database is intruded | Unit testing |
Stimulus source | System dependencies | Customers | Developers and customers | Customers | Attackers | Developers |
Stimulus | Cannot process the video information | Export operation | Customers | Runtime(?) | SQL injection and entitlement | Unit testing of each module |
Artifact | Whole system | UI | System | UI | DBMS | Code |
Environment | Windows/Linux x86_64/32 runtime environment | Web browser | Runtime environment | Web browser | Firewall and encryption | Runtime environment |
Response | Send feedback to the backend if the video from the surveillance cameras cannot be analyzed; retry if the list of scores cannot be exported | Export statistics files in 10 s | Extend and modify functions when a new demand arises | Provide an easy-to-use UI and reliable information | Resist intrusion | Each module passes its test cases |
Response measure | Within 5 min; within 5 s | Within 10 s | All modules are extensible and governed by the evaluation indexes | User satisfaction | Database is protected | Developers |
Tactics | Retry, self-test | Increase resource efficiency | Split module | aaa | Warm backup | Limit non-determinism |
Project Details
- This project explores prominent action recognition models on the UCF-101 dataset.
- The performance of the different models is compared, and an analysis of the experimental results is provided.
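A comparison like this usually comes down to top-1 accuracy on held-out clips. The sketch below shows one way to compute and rank it. The helper names, the model names `model_a` and `model_b`, and the tiny label arrays are placeholders for illustration only; they are not results from this project.

```python
import numpy as np


def top1_accuracy(y_true, y_pred):
    """Fraction of clips whose predicted class equals the ground-truth class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float((y_true == y_pred).mean())


def compare_models(y_true, predictions_by_model):
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = {name: top1_accuracy(y_true, pred)
              for name, pred in predictions_by_model.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder labels and predictions, NOT real experiment results.
y_true = [0, 1, 2, 2, 1]
predictions_by_model = {
    "model_a": [0, 1, 2, 0, 0],
    "model_b": [0, 1, 2, 2, 0],
}
ranking = compare_models(y_true, predictions_by_model)
```

In practice `y_true` would hold the class indices of a UCF-101 test split, and each prediction list would come from one trained model evaluated on that same split.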