课程：高级计算机视觉

章节大纲

选择节评分标准及答疑讨论

评分标准及答疑讨论

全部折叠展开全部
- 选择活动答疑及论文汇报报名
  
  答疑及论文汇报报名讨论区
- 选择活动《高级计算机视觉》选课动机分析
  
  《高级计算机视觉》选课动机分析文件
  
  根据80人的调查报告，有大概一半人是非常需要这门课来达到自己的学术目标的。希望大家一起努力。
- 选择活动评分标准
  
  评分标准网页
  
  平时作业70分。（明年将恢复到本门课设置的初衷，编程作业降低回60分）。作业0到作业8，作业0，10分；后8次作业每次满分15分（有的有pdf手写题目作业5分，编程10分；如果只有编程则15分），超过70分按70分计算，不接受补交，因为每次deadline为下次上课前，上课大课间将对本次作业进行讲解。代码补全作业（提供了基础starter代码的作业）禁止复制粘贴任何代码。其他作业允许使用网上代码，禁止使用（复制粘贴）同学他人作业代码，请同时提交readme文件具体说明使用网上代码的情况。
  
  完成书面综述报告5分。写法按小论文related work写法（中英文均可，要有类似however段落进行批判性总结分析）。
  
  完成批判性口头报告（每位参与者将通过一到两篇代表性科学论文对一个研究主题进行详细研究。参与者将在约 30 分钟的演讲中介绍论文的主要思想。目标是其他参与者在演讲后理解报告人对两篇论文的批判性分析。）15分，可做多次取最高分。
  
  完成对批判点的改进实验，10分。
  
  总计100分。
  
  报告和 review模版请大家注意按此总结多数情况下阅读的重点是：
  A、文章讨论的核心问题是什么？一般不超过三个，会在abstract和introduction里强调出来。在related work文章是怎么评价相关工作存在的问题的。
  B、各个核心问题的解法是什么？做了什么假设？
  先看摘要，再看Introduction知道文章想要讨论的问题/现象，然后直接跳到最后的总结，然后跳到关键例子。这是一种“输入”型的阅读法。
  
  如果你的阅读目的是想为作者处理过的核心问题提出新的解法，想要“输出”，那这时还要加上：
  C、各核心假设是基于什么样的证据和动因，论证是否充分？消融实验是否充分？
  D、解法里使用了什么样的技术，用得对不对？
  E、核心理论有什么样的prediction
  基本要实现看完就能写review section的程度。
选择节参考资料（推荐边看视频边看lecture notes、然后看教材、然后做作业）

参考资料（推荐边看视频边看lecture notes、然后看教材、然后做作业）
- 选择活动教材1 Computer Vision Algorithms and Applications 2ed
  
  教材1 Computer Vision Algorithms and Applications 2ed 资源库文件
  
  您必须至少添加一个完成条件
- 选择活动教材1 Computer Vision Algorithms and Applications计算机视觉--算法与应用
  
  教材1 Computer Vision Algorithms and Applications计算机视觉--算法与应用文件
  
  学生必须
  
  标记完成
- 选择活动ML4360 lecture notes
  
  ML4360 lecture notes 资源库文件
  
  您必须至少添加一个完成条件
- 选择活动教材2 计算机视觉中的多视图几何（原书第2版） (Richard Hartley, Andrew Zisserman, 韦穗, 章权兵)
  
  教材2 计算机视觉中的多视图几何（原书第2版） (Richard Hartley, Andrew Zisserman, 韦穗, 章权兵) 文件
  
  学生必须
  
  标记完成
- 选择活动schoenberger_thesis] Johannes L. Schönberger. “Robust Methods for Accurate and Efficient 3D Modeling from Unstructured Imagery.” ETH Zürich, 2018.
  
  schoenberger_thesis] Johannes L. Schönberger. “Robust Methods for Accurate and Efficient 3D Modeling from Unstructured Imagery.” ETH Zürich, 2018. 文件
  
  学生必须
  
  标记完成
- 选择活动Lecture:ML4360 Computer Vision， Eberhard Karls Universität Tübingen，Prof. Dr. Andreas Geiger
  
  Lecture:ML4360 Computer Vision， Eberhard Karls Universität Tübingen，Prof. Dr. Andreas Geiger 网页地址
  
  学生必须
  
  标记完成
- 选择活动百度云下载ML4360视频及ppt
  
  百度云下载ML4360视频及ppt 网页地址
  
  学生必须
  
  标记完成
  
  https://pan.baidu.com/s/1BPuSzCAfLqjwFHUCaIsobA?pwd=rs48

05 06 图像形成

选择活动阅读教材1 第2章图像形成教材2 4.1 DLT算法

阅读教材1 第2章图像形成教材2 4.1 DLT算法网页

学生必须

标记完成
选择活动Lecture 2.1 (Image Formation： Primitives and Transformations)

Lecture 2.1 (Image Formation： Primitives and Transformations) 资源库文件

您必须至少添加一个完成条件
选择活动Lecture 2.2 (Image Formation： Geometric Image Formation)

Lecture 2.2 (Image Formation： Geometric Image Formation) 资源库文件

学生必须

观看进度：90%
选择活动Lecture 2.3 (Image Formation： Photometric Image Formation)

Lecture 2.3 (Image Formation： Photometric Image Formation) 资源库文件

学生必须

观看进度：90%
选择活动Lecture 2.4 (Image Formation： Image Sensing Pipeline)

Lecture 2.4 (Image Formation： Image Sensing Pipeline) 资源库文件

学生必须

观看进度：90%
选择活动作业1 image_formation 含计算题和编程题(10.11上课前)

作业1 image_formation 含计算题和编程题(10.11上课前)

学生必须

标记完成

提交到

https://classroom.github.com/a/p_pWDT3Z

git clone https://github.com/GDUTCV/ <作业页面自动生成的> .git

cd hw01_image_formation/code/

Create a new environment named lecturecv and install required packages (numpy, etc.) via running:

conda env create -f environment.yml

Note: A typical source of error is to use an old version of conda itself. You can update it via:

conda update -n base conda -c anaconda

Before launching your notebook you need to activate the environment:

conda activate lecturecv

Depending on your configuration, you might instead need to run:

source activate lecturecv

You can now start jupyter notebook from the directory:

jupyter-notebook

A browser window should be opened in which you can open the notebook of the first exercise called image_formation.ipynb

也可以上传到colab然后做编程题。

也可以上传到google drive再用colab打开，此时需要加载drive文件夹

from google.colab import drive

drive.mount('/content/drive')

本周论文研讨网页

HAHA: Highly Articulated Gaussian Human Avatars with Textured Mesh Prior	https://arxiv.org/pdf/2404.01053	https://github.com/david-svitov/HAHA/
Emergent Correspondence from Image Diffusion	https://proceedings.neurips.cc/paper_files/paper/2023/file/0503f5dce343a1d06d16ba103dd52db1-Paper-Conference.pdf	https://diffusionfeatures. github.io
Recurrent Partial Kernel Network for Efficient Optical Flow Estimation	https://hmorimitsu.com/publication/2024-aaai-rpknet/2024-aaai-rpknet.pdf	https://github.com/hmorimitsu/ptlflow

选择节07描述子特征

07描述子特征
- 选择活动观看ML4360 lecture3.1 视频文件见参考资料百度云链接
  
  观看ML4360 lecture3.1 视频文件见参考资料百度云链接网页
  
  学生必须
  
  标记完成
- 选择活动阅读教材1 第4章(第二版是第7章) 特征检测与匹配 schoenberger thesis 第3章以及https://demuc.de/papers/schoenberger2017comparative.pdf
  
  阅读教材1 第4章(第二版是第7章) 特征检测与匹配 schoenberger thesis 第3章以及https://demuc.de/papers/schoenberger2017comparative.pdf 网页
  
  学生必须
  
  标记完成
- 选择活动作业2 复现schoenberger_phd_thesis Table 3.3: Results for our local feature benchmark. Fountain行代码参考https://github.com/ahojnnes/local-feature-evaluation
  
  作业2 复现schoenberger_phd_thesis Table 3.3: Results for our local feature benchmark. Fountain行代码参考https://github.com/ahojnnes/local-feature-evaluation 网页
  
  学生必须
  
  标记完成
  
  1.作业2 提交到https://classroom.github.com/a/gzMZO0nH阅读 https://github.com/ahojnnes/local-feature-evaluation/blob/master/INSTRUCTIONS.md 并配置环境准备数据
  
  2.运行理解代码 scripts/matching_pipeline.m
  
  3.运行理解代码
  
  scripts/reconstruction_pipeline.py
  
  4. 可视化论文图片结果
选择节08三维重建1

08三维重建1
- 选择活动观看ML4360 lecture3.2-3.4 视频文件见参考资料百度云链接
  
  观看ML4360 lecture3.2-3.4 视频文件见参考资料百度云链接网页
  
  学生必须
  
  标记完成
- 选择活动阅读相机标定相关 http://www.vision.caltech.edu/bouguetj/calib_doc/index.html I https://docs.opencv.org/master/dc/dbb/tutorial_py_calibration.html I See also: Szeliski Book, Chapter 11.1
  
  阅读相机标定相关 http://www.vision.caltech.edu/bouguetj/calib_doc/index.html I https://docs.opencv.org/master/dc/dbb/tutorial_py_calibration.html I See also: Szeliski Book, Chapter 11.1 网页
  
  学生必须
  
  标记完成
- 选择活动阅读教材1第7章（2ed 11章）
  
  阅读教材1第7章（2ed 11章）网页
  
  学生必须
  
  标记完成
- 选择活动作业3.1 完成ML4360作业2 中1.1 1.2 2.1 https://classroom.github.com/a/v3HF0XzS
  
  作业3.1 完成ML4360作业2 中1.1 1.2 2.1 https://classroom.github.com/a/v3HF0XzS
  
  学生必须
  
  标记完成
  
  提交到
  
  https://classroom.github.com/a/v3HF0XzS
选择节10三维重建2及colmap应用

10三维重建2及colmap应用
- 选择活动SfM1
  
  SfM1 资源库文件
  
  您必须至少添加一个完成条件
- 选择活动SfM2
  
  SfM2 资源库文件
  
  学生必须
  
  观看进度：90%
- 选择活动SfM3
  
  SfM3 资源库文件
  
  学生必须
  
  观看进度：90%
- 选择活动SfM4
  
  SfM4 资源库文件
  
  您必须至少添加一个完成条件
- 选择活动阅读教材2第18章、附录6、schoenberger_thesis的7 Structure-from-Motion Revisited
  
  阅读教材2第18章、附录6、schoenberger_thesis的7 Structure-from-Motion Revisited 网页
  
  学生必须
  
  标记完成
  
  阅读教材2第18章、附录6、schoenberger_thesis的7 Structure-from-Motion Revisited
  
  https://www.bilibili.com/video/BV1p24y1c7MH?spm_id_from=333.788.videopod.episodes&vd_source=d21f89893c9e3093d77e243b2e781f93&p=27
  
  手写Bundle Adjustment对三维重建/SLAM研究的作用
  
  https://blog.csdn.net/qq_15698613/article/details/102928884
- 选择活动作业4 完成如下作业, 详细要求见文件夹下lab3.pdf
  
  作业4 完成如下作业, 详细要求见文件夹下lab3.pdf
  
  学生必须
  
  标记完成
  
  提交到
  
  https://classroom.github.com/a/rQ6QqF65
选择节11多视角双目重建MVS

11多视角双目重建MVS
- 选择活动L04 - Stereo Reconstruction
  
  L04 - Stereo Reconstruction 网页
  
  https://www.bilibili.com/video/BV1ov411j7Uj?spm_id_from=333.788.videopod.episodes&vd_source=d21f89893c9e3093d77e243b2e781f93&p=12
- 选择活动作业3.2完成ML4360作业2 中剩余双目部分 https://classroom.github.com/a/v3HF0XzS (复制)
  
  作业3.2完成ML4360作业2 中剩余双目部分 https://classroom.github.com/a/v3HF0XzS (复制)
  
  学生必须
  
  标记完成
  
  https://classroom.github.com/a/v3HF0XzS
- 选择活动作业5.1 利用工二413多相机设备，三维重建学生本人三维头像、采集自然、高兴、生气、厌恶、害怕5种表情动作的4D数据
  
  作业5.1 利用工二413多相机设备，三维重建学生本人三维头像、采集自然、高兴、生气、厌恶、害怕5种表情动作的4D数据
  
  学生必须
  
  标记完成
- 选择活动作业5.2 学生本人头像稠密重建
  
  作业5.2 学生本人头像稠密重建
  
  学生必须
  
  标记完成
  
  作业5提交
  
  https://classroom.github.com/a/DuOfr7Sr
选择节12数字人三维重建

12数字人三维重建
- 选择活动SMPL made Simple Tutorial at CVPR 2021
  
  SMPL made Simple Tutorial at CVPR 2021 网页
  
  10:00-11:20 Introduction to the tutorial and learning objectives. Overview of 3D body models, the history, mesh registration, linear blend skinning, SMPL and related models. Contents: history of body models, scanning, registration, PCA, linear blend skinning, corrective blend shapes, SMPL, faces, hands, SMPL-X, dynamics of soft tissue, future directions like implicit surfaces and neural rendering. Instructor: Michael Black
  
  https://www.bilibili.com/video/BV1ysmtYjEYc/
  
  11:20-11:40 Fitting SMPL to images using optimization. Instructor: Dimitrios Tzionas
  
  https://www.bilibili.com/video/BV1Zom4YvEsy/
  
  11:40-12:00 Regressing SMPL from images. Instructor: Timo Bolkart
  
  https://www.bilibili.com/video/BV1oFmtYYEJQ/
- 选择活动siggraphasia2017 Learning a model of facial shape and expression from 4D scans
  
  siggraphasia2017 Learning a model of facial shape and expression from 4D scans 文件
  
  学生必须
  
  标记完成
  
  https://github.com/soubhiksanyal/FLAME_PyTorch
- 选择活动CVPR 2023 FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction
  
  CVPR 2023 FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction 文件
  
  学生必须
  
  标记完成
  
  https://github.com/csbhr/FFHQ-UV/tree/main
- 选择活动作业6 在作业5的基础上制作学生本人模型和UV纹理
  
  作业6 在作业5的基础上制作学生本人模型和UV纹理
  
  学生必须
  
  标记完成
  
  https://github.com/GDUTCV/homework-06
选择节13 图模型及光流法

13 图模型及光流法
- 选择活动作业7.1 选作业4的一个视角可视化学生本人头像的图模型光流
  
  作业7.1 选作业4的一个视角可视化学生本人头像的图模型光流
  
  学生必须
  
  标记完成
- 选择活动Probabilistic Graphical Models & Optical Flow
  
  Probabilistic Graphical Models & Optical Flow 网页
  
  https://www.bilibili.com/video/BV1ov411j7Uj?spm_id_from=333.788.videopod.episodes&vd_source=d21f89893c9e3093d77e243b2e781f93&p=17
  
  5.1 - Structured Prediction | Video
  
  5.2 - Markov Random Fields | Video
  
  5.3 - Factor Graphs | Video
  
  5.4 - Belief Propagation | Video
  
  5.5 - Examples | Video
  
  6.3 - Optical Flow | Video
选择节14深度学习及光流法

14深度学习及光流法
- 选择活动作业7.2 选作业4的一个视角可视化学生本人头像的GMA光流并比较差异
  
  作业7.2 选作业4的一个视角可视化学生本人头像的GMA光流并比较差异
  
  学生必须
  
  标记完成
- 选择活动Learning Optical Flow: from Model to Data
  
  Learning Optical Flow: from Model to Data 网页
  
  https://www.bilibili.com/video/BV1NyzuYGELa
  
  Sun, D., Yang, X., Liu, M.-Y., Kautz, J., 2018. Pwc-net: Cnns for optical flow using pyramid, warping, and cost volume In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 8934–8943.
  
  Sun, Deqing, et al. "Autoflow: Learning a better training set for optical flow." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
  
  https://www.bilibili.com/video/BV1R9zvYcEsr
  
  Teed, Z., Deng, J., 2020. Raft: Recurrent all-pairs field transforms for optical flow In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16. Springer, pp. 402–419.
  
  https://www.bilibili.com/video/BV1Ww4m117aY
  
  Smith, C., Charatan, D., Tewari, A., & Sitzmann, V. (2024). FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent. arXiv preprint arXiv:2404.15259.
选择节15 NERF and 3D Gaussian

15 NERF and 3D Gaussian
- 选择活动Coordinate-based Networks
  
  Coordinate-based Networks 文件
  
  学生必须
  
  标记完成
- 选择活动3D Gaussian Splatting Tutorial - 3DV 2024
  
  3D Gaussian Splatting Tutorial - 3DV 2024 网页地址
  
  学生必须
  
  标记完成
  
  https://3dgstutorial.github.io/index.html
  
  3dgs cuda代码讲解 diff-gaussina-rasterization/cuda-rasterizer/forward.cu
  
  41:41
  
  https://www.bilibili.com/video/BV1yu411M7bQ/?spm_id_from=333.337.search-card.all.click&vd_source=d21f89893c9e3093d77e243b2e781f93
  
  https://www.bilibili.com/video/BV1PhgKeVE6f/?spm_id_from=333.337.search-card.all.click&vd_source=d21f89893c9e3093d77e243b2e781f93
- 选择活动3D Gaussian Splatting Tutorial - 3DV 2024
  
  3D Gaussian Splatting Tutorial - 3DV 2024 文件夹
  
  学生必须
  
  标记完成
- 选择活动作业7.3 在作业4的基础上生成多视角（413原多视角不够稠密）生成学生本人4DGS，可视化学生本人头像的4DGS光流和GMA光流并比较差异
  
  作业7.3 在作业4的基础上生成多视角（413原多视角不够稠密）生成学生本人4DGS，可视化学生本人头像的4DGS光流和GMA光流并比较差异
  
  学生必须
  
  标记完成
- 选择活动4DGS CUDA渲染代码改光流输出（张家霖）
  
  4DGS CUDA渲染代码改光流输出（张家霖）资源库文件
  
  学生必须
  
  观看进度：90%
选择节16 4DGS & 时空特征

16 4DGS & 时空特征
4DGS

event camera

时空 transformer

de Blegiers, Tristan, et al. "EventTransAct: A video transformer-based framework for Event-camera based action recognition." 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2023.
- 选择活动作业8 AU及表情检测demo
  
  作业8 AU及表情检测demo 网页
  
  利用gradio实现上述UI，可以参考作业0或其他网上代码。选择调用三年内发表论文的开源au及表情识别模型，实现上述功能。
- 选择活动作业八
  
  作业八
  
  提交到
  
  https://classroom.github.com/a/hIFixha1
选择节综述、论文报告提交链接

综述、论文报告提交链接
- 选择活动作业9
  
  作业9
  
  提交到https://classroom.github.com/a/hIFixha1
  
  截止日期2024年12月20日23时59分

MoML: Online Meta Adaptation for 3D Human Motion Prediction	https://openaccess.thecvf.com/content/CVPR2024/html/Sun_MoML_Online_Meta_Adaptation_for_3D_Human_Motion_Prediction_CVPR_2024_paper.html
HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting	https://arxiv.org/abs/2312.02902
GaussianEditor:Swift And Controllable 3D Editing With Gaussian Splatting	https://openaccess.thecvf.com/content/CVPR2024/papers/Chen_GaussianEditor_Swift_and_Controllable_3D_Editing_with_Gaussian_Splatting_CVPR_2024_paper.pdf	https://buaacyw.github.io/gaussian-editor

高级计算机视觉

章节大纲

评分标准及答疑讨论

参考资料（推荐边看视频边看lecture notes、然后看教材、然后做作业）

04 05计算机视觉基础

修改gradio窗口布局

增加AR demo

调用增加新的facial landmark detection模型

05 06 图像形成

07描述子特征

08三维重建1

10三维重建2及colmap应用

11多视角双目重建MVS

12数字人三维重建

13 图模型及光流法

14深度学习及光流法

15 NERF and 3D Gaussian

16 4DGS & 时空特征

综述、论文报告提交链接