Inference in Deep Learning

Published: 01 October 2024
Channel: Connor Shorten

This video explores NVIDIA's results on the MLPerf Inference benchmark, along with other techniques for faster inference such as quantization, architecture search, distillation, and pruning!
Thanks for watching! Please subscribe and check out the links below!

NVIDIA wins MLPerf Inference Benchmarks:
https://news.developer.nvidia.com/mlp...
MLPerf Inference Overview:
https://www.mlperf.org/inference-over...
TensorRT:
https://developer.nvidia.com/tensorrt
TensorRT NVIDIA Webinar:
http://on-demand.gputechconf.com/gtcd...
TensorFlow Lite Quantization:
https://www.tensorflow.org/lite/perfo...
DeepCompression:
https://arxiv.org/pdf/1510.00149.pdf
Pruning Filters:
https://openreview.net/pdf?id=rJqFGTslg
Lottery Ticket Hypothesis:
https://arxiv.org/pdf/1803.03635.pdf
Knowledge Distillation:
https://arxiv.org/pdf/1503.02531.pdf
DistilBERT:
https://arxiv.org/abs/1910.01108
DistilBERT Blog Post:
/ distilbert
INT4 Precision:
https://devblogs.nvidia.com/int4-for-...
Quantizing Deep CNNs:
https://arxiv.org/pdf/1806.08342.pdf
DAWNBench:
https://dawn.cs.stanford.edu/benchmar...
MLPerf Inference Benchmark (Paper):
https://arxiv.org/pdf/1911.02549.pdf