HIGH-EFFICIENCY VVC VIDEO ENCODER
PID2021-128167OA-I00
Funding agency name Agencia Estatal de Investigación
Funding agency acronym AEI
Programme Programa Estatal para Impulsar la Investigación Científico-Técnica y su Transferencia
Subprogramme Subprograma Estatal de Generación de Conocimiento
Call Proyectos de I+D+I (Generación de Conocimiento y Retos Investigación)
Call year 2021
Managing unit Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023
Beneficiary institution UNIVERSIDAD POLITECNICA DE MADRID
Persistent identifier http://dx.doi.org/10.13039/501100011033
Publications
Dataset for determining areas of interest for video compression
Archivo Digital UPM
- Quintana Agudo, Alejandro
- Blanco Álvarez, David
- Fernández Baizán, Covadonga
- Díaz Honrubia, Antonio Jesús
A large amount of audiovisual information is currently being generated due, among other factors, to the growing popularity of social networks and the increase in the quality of films and video games. As a result, video streams are becoming heavier and more costly to transmit. In addition, the most modern video compression standards are not computationally affordable on consumer hardware. This has motivated new approaches such as perceptual video coding, in which the parts of the video to which the viewer is expected to pay less attention are compressed more aggressively. This work presents a dataset for training user-attention models that can later be used in the framework of perceptual video coding. Moreover, a Random Forest model trained on this dataset is able to predict 75% of the areas of interest to viewers.
Project: MINECO//PID2021-128167OA-I00
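As an illustration of how such an attention model could be trained from a per-CTU feature table, the following is a minimal sketch using scikit-learn's RandomForestClassifier; the file name, the label column name, and the binarization of the interest label are assumptions made for illustration, not the published setup.

# Minimal sketch: training a Random Forest attention model on a per-CTU
# feature table. The file name and column names are illustrative assumptions,
# not the published dataset layout.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("ctu_attention.csv")             # hypothetical file name
X = df.drop(columns=["interest"])                 # per-CTU input features
y = df["interest"] > df["interest"].median()      # binarize: area of interest or not

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))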
A fast full partitioning algorithm for HEVC-to-VVC video transcoding using Bayesian classifiers
Archivo Digital UPM
- García Lucas, David
- Cebrián Márquez, Gabriel
- Díaz Honrubia, Antonio Jesús
- Mallikarachchi, Thanuja
- Cuenca, Pedro
The Versatile Video Coding (VVC) standard was released in 2020 to replace the High Efficiency Video Coding (HEVC) standard, making it necessary to convert HEVC-encoded content to VVC in order to exploit its compression performance, which was achieved by using a larger block size of 128 × 128 pixels, among other new coding tools. However, 80.93% of the encoding time is spent on finding a suitable block partitioning. To reduce this time, this proposal presents an HEVC-to-VVC transcoding algorithm focused on accelerating the CTU partitioning decisions. The transcoder extracts information from the input HEVC bitstream and feeds it to two Bayes-based models. Experimental results show a time saving of 45.40% in the transcoding process compared with the traditional cascade transcoder. This time gain is obtained on average over all test sequences in the Random Access scenario, at the expense of only a 1.50% BD-rate penalty.
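A minimal sketch of the kind of Bayes-based split decision described above, assuming per-CTU features extracted from the decoded HEVC bitstream; the feature set, the toy training data, and the choice of a Gaussian Naive Bayes classifier are illustrative assumptions, not the models used in the paper.

# Sketch: a Gaussian Naive Bayes model that predicts whether a VVC CTU should be
# split further, from features taken out of the decoded HEVC bitstream.
# Feature names and training values are illustrative assumptions only.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical per-CTU features: mean HEVC depth, residual energy, bits spent.
X_train = np.array([
    [1.2, 350.0, 420],
    [0.1,  20.0,  35],
    [2.7, 900.0, 880],
    [0.4,  60.0,  90],
])
y_train = np.array([1, 0, 1, 0])   # 1 = split the CTU further, 0 = do not split

clf = GaussianNB().fit(X_train, y_train)

new_ctu = np.array([[2.1, 640.0, 700]])
print("split decision:", clf.predict(new_ctu)[0],
      "posterior:", clf.predict_proba(new_ctu)[0])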
A parallel quadtree partitioning algorithm for the VVC standard
Archivo Digital UPM
- González Ruiz, Alberto
- Díaz Honrubia, Antonio Jesús
- Tapia Fernández, Santiago
- García Lucas, David
- Cebrián Márquez, Gabriel
- Mengual Galán, Luis
The Versatile Video Coding (VVC) standard was released in 2020 as a replacement for the High Efficiency Video Coding (HEVC) standard. HEVC used a quadtree to decide the best splitting of Coding Tree Units (CTUs). While VVC maintains this approach, it also allows the best CTU splitting to be searched for using binary and ternary trees in the leaves of the quadtree, which introduces a very high computational cost and increases the compression time by a disproportionate amount. Moreover, VVC enlarges the maximum size of a CTU from 64x64 to 128x128 pixels, which also increases the encoding time, since the search for the best splitting is larger than in HEVC. To reduce this time, this proposal presents a parallelization algorithm for the quadtree in VVC, using its second level to distribute the work among different threads. Experimental results show that an acceleration of 1.72x is achieved in the VVC encoder, without any significant impact in terms of BD-rate on the encoded video stream.
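A minimal sketch of the parallelization idea, distributing the second quadtree level (the four 64x64 quadrants of a 128x128 CTU) across worker threads; the cost function is a stand-in for the real rate-distortion search, and the whole snippet is an illustration under those assumptions rather than the encoder's implementation.

# Sketch: distributing the second quadtree level of a 128x128 CTU
# (its four 64x64 quadrants) across threads. rd_search() is a placeholder
# for the real rate-distortion partition search.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def rd_search(block):
    """Placeholder for the recursive partition search on one 64x64 block."""
    return float(np.var(block))          # stand-in cost, not a real RD cost

def split_ctu(ctu):
    """Yield the four 64x64 quadrants of a 128x128 CTU."""
    for r in (0, 64):
        for c in (0, 64):
            yield ctu[r:r + 64, c:c + 64]

ctu = np.random.randint(0, 256, (128, 128), dtype=np.uint8)   # dummy luma CTU

with ThreadPoolExecutor(max_workers=4) as pool:
    costs = list(pool.map(rd_search, split_ctu(ctu)))

print("per-quadrant costs:", costs)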
Dataset annotated with users' attention for VVC-based video blocks
e-cienciaDatos, Repositorio de Datos del Consorcio Madroño
- Kessler Martín, Jorge
- Ríos Sánchez, Belén
- Díaz Honrubia, Antonio Jesús
Project description

The main objective of the project in which the dataset has been produced is to design and implement a video encoder that produces videos complying with the VVC standard specifications while requiring less computation time (and thus less power consumption), and to obtain a higher compression rate without affecting the subjective quality.

Specifically, this work focuses on improving the efficiency of the compression factor obtained by the VVC standard using perceptual video coding techniques. With this dataset we intend to study which parts of the frames viewers tend to pay more attention to.

Dataset description

This dataset provides a basis for studying the areas of interest to which viewers pay attention while watching video sequences encoded with the VVC standard. Attention has been calculated on the Coding Tree Units (CTUs) of the VVC video standard, i.e., on blocks of 128x128 pixels.

The dataset consists of a single CSV file with a total of 10 variables, 9 of them input variables for the models learnt from the dataset and 1 the variable to be predicted (a feature-extraction sketch follows the list):

1. pos_row: the relative position (between 0 and 1) of the CTU among all the rows of CTUs.
2. pos_col: the relative position (between 0 and 1) of the CTU among all the columns of CTUs.
3. Center_distance: the relative distance between the center of the CTU and the center of the frame (e.g., the center of the frame is represented as (0.5, 0.5) in relative terms).
4. Mean: the mean value of the luma pixels of the CTU.
5. Median: the median value of the pixels of the CTU.
6. Std: the standard deviation of the pixels inside the CTU.
7. var: the variance of the pixels of the CTU.
8. Asymmetry: the skewness (or asymmetry) coefficient of the pixels of the CTU.
9. Kurtosis: the kurtosis coefficient of the pixels of the CTU.
10. Interest: the number of viewers who paid attention to the CTU (out of 34 viewers).
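A minimal sketch of how the nine input variables above could be computed for one 128x128 luma CTU with NumPy/SciPy; the frame source, the CTU grid indexing, and the exact normalization of the positional features are assumptions made for illustration.

# Sketch: computing the nine input variables above for one 128x128 luma CTU.
# The frame and the CTU grid normalization are assumed; only the statistics
# follow the variable list.
import numpy as np
from scipy.stats import skew, kurtosis

def ctu_features(frame, row, col, ctu_size=128):
    rows = frame.shape[0] // ctu_size      # number of CTU rows in the frame
    cols = frame.shape[1] // ctu_size      # number of CTU columns in the frame
    block = frame[row * ctu_size:(row + 1) * ctu_size,
                  col * ctu_size:(col + 1) * ctu_size].astype(np.float64)

    pos_row = row / (rows - 1) if rows > 1 else 0.0   # assumed normalization
    pos_col = col / (cols - 1) if cols > 1 else 0.0   # assumed normalization
    center_distance = np.hypot(pos_row - 0.5, pos_col - 0.5)

    return {
        "pos_row": pos_row,
        "pos_col": pos_col,
        "Center_distance": center_distance,
        "Mean": block.mean(),
        "Median": np.median(block),
        "Std": block.std(),
        "var": block.var(),
        "Asymmetry": skew(block, axis=None),
        "Kurtosis": kurtosis(block, axis=None),
    }

luma = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)   # dummy luma frame
print(ctu_features(luma, row=1, col=3))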
Project: AEI//PID2021-128167OA-I00