R+D+BIT brings together for the first time the best research talent applied to the audiovisual industry
This initiative, promoted by BIT 2016, selects around thirty research projects in fields such as audio and video over IP; synthetic speech recognition and generation; metadata applied to big data; virtual, augmented and immersive reality; interaction with objects; neuroscience; adaptive streaming; accessibility; automatic video and audio analysis; 360º production; motion capture; content search, cataloguing and recommendation; usability in interactive TV; and synchronization of hybrid content.
Close to thirty research and development projects have been submitted to the R+D+BIT call promoted by BIT 2016, the Professional Audiovisual Technology Show organized by IFEMA from May 24 to 26, 2016.
These projects, developed in both academia and industry, seek new technologies with high added value and growth potential for the audiovisual industry.
The R+D+BIT initiative thus aims to recognize the work of the professionals and organizations behind audiovisual research, whose role is essential to the sector, while making society aware of its most innovative projects.
A Selection Committee chaired by Pere Vila, Director of Technology, Innovation and Systems at RTVE Corporación, has selected a total of 25 projects, taking into account the timeliness of the research carried out; its capacity to influence the future development of the audiovisual industry; its potential for application; the originality of its approach, method or subject; and its ability to bring together and foster collaboration among different stakeholders.
Those who visit BIT 2016 will be able to learn first-hand about the development of these projects thanks to a series of informative posters and presentation sessions.
Audio and video over IP; synthetic speech recognition and generation; metadata applied to big data; virtual, augmented and immersive reality; interaction with objects; neuroscience; production and distribution of scientific content; adaptive streaming; stage lighting; accessibility; automatic video and audio analysis; 360º production; motion capture; content search, cataloguing and recommendation; usability in interactive TV; synchronization of hybrid content... these are some of the work areas of the audiovisual sector investigated by the projects selected for R+D+BIT.
The main lines of each of the selected research projects are detailed below:
AEQ
AEQ, in collaboration with Neogroupe, is developing a project that analyzes how telephony is used in the broadcasts of two major European broadcasters: Radio Nacional de España and Radio France. Its objective is to adapt a broadcast telephony system to user requirements, taking into account its integration with pre-existing control applications and developing a tailored control protocol. In this project, which is supported by the CDTI, AEQ is working hand in hand with telephony managers, technical directors, producers, directors, controllers...
AEQ has installed an operational model in RNE's Studio 102 that parameterizes the on-air handling of telephone calls and its interoperability with large telephone systems and digital switchboards.
EUNISON
The generation of speech from fundamental physical principles, using the current capacity of parallel supercomputing, is the objective of the Eunison project (Extensive UNIfied-domain SimulatiON of the Human Voice), promoted by the Grup de Recerca en Tecnologies Mèdia (GTM) of La Salle (Universitat Ramon Llull) together with KTH, Gipsa-Lab, CIMNE (UPC) and FAU-Erlangen. Based on magnetic resonance imaging of the larynx and vocal tract, and solving the relevant equations, the system is capable of generating sounds from a unified simulation engine validated against experimental tests. The ultimate goal of this research is to make possible the dream of reproducing, in simulation, the full functioning of our speech apparatus.
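For orientation, the core physics behind such a unified simulation can be summarized, in a deliberately simplified, lossless form (the actual Eunison solvers couple acoustics with tissue mechanics and airflow), by the acoustic wave equation for the pressure field p inside the vocal tract:

\[
\frac{\partial^2 p}{\partial t^2} = c^2 \, \nabla^2 p
\]

where c is the speed of sound in the air of the tract.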
The object of the research is so novel and challenging that it has been funded through the Future and Emerging Technologies (FET) scheme of the European Commission's Seventh Framework Programme (FP7), with a budget of 2.96 million euros.
Long-term perspectives include the development of natural voice synthesis applicable to the generation of audiovisual content such as film dubbing, complete simulation of virtual characters (including the actual voice generation process), new forms of cultural expression...
DIGIBIT
DigiBit and the Madrid Conservatory of Music are immersed in a practical project called MusicBit that aims to create a metadata database covering music of various genres, with special attention to Spanish and European classical music from the Middle Ages to the present day. While most Internet music providers barely allow searching by artist, genre or song/work, this project aims to launch a database with 18 metadata fields. To date, no database in the world collects this information, which will be accessible through iOS and Android applications. So far they have incorporated more than 75,000 records, with plans to add another 25,000 of classical music, as well as other genres such as jazz, folk, pop and rock.
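As a rough illustration of what such a record might hold (the actual 18 MusicBit fields are not listed here, so the field names below are purely hypothetical), a classical-music entry could be sketched like this:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of a rich music-metadata record; the real MusicBit
# schema defines 18 fields that are not enumerated in this article.
@dataclass
class MusicRecord:
    title: str
    composer: str
    period: str                               # e.g. "Baroque", "Romantic"
    genre: str                                # e.g. "classical", "jazz"
    instrumentation: str
    key: Optional[str] = None
    catalogue_number: Optional[str] = None    # e.g. "BWV 1007"
    year_composed: Optional[int] = None

record = MusicRecord(
    title="Cello Suite No. 1 in G major",
    composer="Johann Sebastian Bach",
    period="Baroque",
    genre="classical",
    instrumentation="solo cello",
    key="G major",
    catalogue_number="BWV 1007",
    year_composed=1720,
)
```

Richer fields of this kind are what allow queries that go well beyond artist, genre or title.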
BRAINSTORM
One of the difficulties that production companies and broadcasters encounter when working with virtual studios is the complexity of the equipment and software involved. Currently, only large content producers can afford the equipment, tools and human team needed to run a virtual set. To “democratize” access to virtual sets, and to simplify their management and operation, Brainstorm Multimedia is developing, together with the Polytechnic University of Valencia, the SmartSet project, which not only reduces the complexity of using virtual sets but also aims to launch a marketplace where users can download content, models and scenarios that they can personalize and adjust to their needs.
In parallel, Brainstorm, together with several broadcasters such as RTVE, TVR and BlueSky, institutions such as the University of Surrey and the IRT, and companies such as Never.no and Signum, is promoting a project whose main objective is to integrate content generated on social networks with real-time 3D graphics, developing a fully integrated solution that collects information from all social networks, structures it appropriately and represents it as 3D graphics in augmented reality environments, with the possibility of interaction by the presenter. The growing popularity of social networks, combined with the spectacular nature of 3D graphics in augmented reality environments, offers a unique opportunity to meet this need and provide a distinctive tool to the sector.
INTERACTIVE PUPPETS
APACIA (Association of Professionals of Cultural Activities for Children and Adolescents) aims, together with Museum I+D+C and the Laboratory of Digital Culture and Hypermedia Museography (Complutense University), to establish a new narrative syntax through live digital puppets (avatars) based on Don Quixote de la Mancha in order to collect data and experiences that allow the real and safe interactivity demanded by new audiences on television.
DRAW
The Media Technologies Research Group (GTM) at La Salle Campus Barcelona (Ramon Llull University) is currently researching a new type of interaction with connected objects through drawing. The user draws, on a mobile device, the object they want to interact with, and the device then lets them control it. It will also be possible to interact with more than one connected object at the same time, putting them in context. The user's mental load is thus drastically reduced, since they only have to draw on the device, turning it into a useful and very simple interface for interacting with audiovisual products.
HDR
Developing image processing algorithms, mainly based on vision science, with application to the audiovisual industry is the main objective of the IP4EC (Image Processing for Enhanced Cinematography) project, funded by the European Research Council (ERC) and carried out at Pompeu Fabra University (UPF). This research team is using visual perception and neuroscience models to manipulate high dynamic range (HDR) images, adapt colors to the possibilities of each screen and optimize the appearance of the images. They are also working on noise removal and color stabilization in broadcast and post-production environments. With the development of HDR, methods are needed to work with this technology and to explore how to adapt it to already-produced (legacy) content. The project, developed by a team of nine professionals of different nationalities and disciplines, began in 2012 under an ERC Starting Grant and will conclude next year.
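To give a flavour of the kind of processing involved (this is a generic, textbook-style global operator, not the perception-based algorithms the IP4EC team actually develops), a minimal sketch of tone mapping an HDR frame down to a displayable range could look like this:

```python
import numpy as np

def tone_map_reinhard(hdr: np.ndarray, key: float = 0.18) -> np.ndarray:
    """Very simple global tone mapping (Reinhard-style) for one HDR frame.

    hdr: float array of linear radiance values, shape (H, W, 3).
    Returns values in [0, 1] ready for a standard-dynamic-range display.
    """
    # Per-pixel luminance (Rec. 709 weights).
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    # Scale the image around its log-average luminance.
    log_avg = np.exp(np.mean(np.log(lum + 1e-6)))
    scaled = key * lum / log_avg
    # Compress: very bright values saturate smoothly towards 1.
    mapped = scaled / (1.0 + scaled)
    # Apply the luminance ratio to all three channels.
    ratio = mapped / (lum + 1e-6)
    return np.clip(hdr * ratio[..., None], 0.0, 1.0)

# Example: a synthetic frame with a 10,000:1 dynamic range.
frame = np.random.uniform(0.01, 100.0, size=(1080, 1920, 3)).astype(np.float32)
sdr = tone_map_reinhard(frame)
```

Perception-based approaches such as IP4EC's go further, using models of how the visual system responds to the content and to the target screen rather than a fixed compression curve.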
RECREATED
The company Imageen, together with Francisco de Vitoria University and UNED, is developing a prototype for the recreation of the creative process through virtual reality (RECREAT). This research aims to develop technologies for immersion in works of art through an analysis focused on the author's creative process. RECREAT seeks to build models for immersion in and reading of works of art that can shape the approaches of museums, art centers and transmedia projects, with narratives typical of the web 3.0 space, offering rich, multiple semantic levels of the work during the immersion process. This project could have a decisive impact on the generation of new immersive technologies and audiovisual narratives in the arts and on the models of cultural institutions of the 21st century.
CLOUD LAB
The Hipermedia laboratory, a spin-off of the Carlos III University of Madrid, is researching the creation, editing, storage and publication of interactive teaching-learning tools (such as presentations and different types of assessment questions) based on the use of video, managed in a database with cloud computing technology. Given the growing importance of video and interactivity as key elements in the design of teaching materials, the ability to create and store presentations and online tools, together with the possibility of editing video from the cloud, provides added value. This tool is already being used by FIFA, the AFC and the football federations of Japan, Belgium and Qatar, which base their training activities on video and “learning by doing” dynamics.
TRANSVIDEO ADAPT
Currently there are various proprietary solutions, along with an international standard, MPEG-DASH, for adaptive streaming. However, a solution is in demand that makes it easier to provide a content distribution service regardless of the format used. Along these lines, Nokia and the Polytechnic University of Madrid are developing the Transvideoadapt project, which aims to deliver a platform that can generate different formats of audiovisual content in real time from a single copy, following adaptive streaming schemes and minimizing the storage resources required. In this way, the content can adapt in real time to the customer's device.
The Transvideoadapt platform offers an alternative to the traditional content distribution business model: one in which the network infrastructure provider can offer, as a service, the adaptation of content to the demand profiles required at each moment and for each device. In addition, support for MPEG-DASH will help advance the adoption of this standard as an integrative technological solution.
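To make the adaptive-streaming idea concrete (a generic MPEG-DASH-style client heuristic, not the Transvideoadapt implementation itself), a player typically requests, for each segment, the highest-bitrate representation that fits the throughput it has just measured:

```python
# Generic adaptive-bitrate selection sketch: the client picks, per segment,
# the highest representation whose bitrate fits the measured throughput,
# with a safety margin. The bitrate ladder below is illustrative only.

REPRESENTATIONS_KBPS = [400, 1200, 2500, 5000, 8000]  # as advertised in a DASH MPD

def choose_representation(measured_throughput_kbps: float,
                          safety_factor: float = 0.8) -> int:
    """Return the bitrate (kbps) of the representation to request next."""
    budget = measured_throughput_kbps * safety_factor
    candidates = [r for r in REPRESENTATIONS_KBPS if r <= budget]
    return max(candidates) if candidates else min(REPRESENTATIONS_KBPS)

# Example: network throughput drops between consecutive segments.
for throughput in (6000, 3000, 1000):
    print(throughput, "->", choose_representation(throughput))
```

The point of Transvideoadapt is that these representations need not all be stored in advance: they can be generated on the fly from a single copy when the client requests them.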
MAFILED
Marco Fidel Vargas, director of the International Solidarity School of Performing Arts Project, together with Jesús Marcos García, who earned Honors in Electronic Engineering and Robotics at UNED with this idea, have presented to R+D+BIT a UNED-sponsored project which, under the name MAFILED, develops an alternative lighting system for performance spaces such as sets, theaters and auditoriums, aimed at ensuring the safe movement of professionals who work in poor lighting conditions. Ideal for safety and work lighting, its telescopic, trapezoidal structure solves the problem of darkness in passage areas between curtains and sets.
MULTIDUB
Peranoid and the Polytechnic University of Valencia are researching the digital processing of the audio signal of audiovisual material, whether in cinema, television or other media, to identify the content and its current playback position, in order to synchronize, via smartphone and headphones, an alternative dubbing track with what is being shown on the main screen. In line with what Shazam does for music identification, Multidub not only identifies the audiovisual content but also lets the user listen to it with a dubbing different from that of the broadcast. Its promoters highlight that this idea will democratize the dubbing industry, monetizing it in places where it was previously not accessible.
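As a toy illustration of the position-estimation step (real fingerprinting systems match compact spectral hashes, Shazam-style, rather than raw samples, and this is not Multidub's actual code), the offset of a captured snippet within a known soundtrack can be recovered by cross-correlation:

```python
import numpy as np

def estimate_offset(reference: np.ndarray, captured: np.ndarray,
                    sample_rate: int) -> float:
    """Estimate where a captured microphone snippet sits inside a longer
    reference soundtrack, using cross-correlation. Returns seconds."""
    corr = np.correlate(reference, captured, mode="valid")
    return int(np.argmax(corr)) / sample_rate

# Toy example: the "captured" clip starts 2.5 s into the reference track.
sr = 4000
reference = np.random.randn(6 * sr)
captured = reference[int(2.5 * sr):int(3.5 * sr)]
print(round(estimate_offset(reference, captured, sr), 2))  # ~2.5
```

Once the playback position is known, the smartphone simply plays the alternative dubbing track from that same position through the user's headphones.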
Barely a year old, MultiDub has already been selected by two European acceleration platforms (CreatiFI and IMPACT) and by the Lanzadera accelerator (driven by businessman Juan Roig).
SOCIOGRAPH NEUROMARKETING
Analytic System 3.0, developed by Sociograph Neuromarketing in collaboration with the University of Valladolid, allows the effectiveness of audiovisual content to be validated and objectively measured before it is broadcast. After years of experience and research, this team has developed an exclusive analysis methodology that, thanks to the use of algorithms, is capable of measuring the impact on a target audience. Large audiovisual groups such as Mediaset España have already tested this method to better validate and optimize the marketing of their audiovisual products, with 100% success in its analyses.
UHD PRODUCTION
The Carlos III, Rey Juan Carlos and Complutense Universities together with 709 Mediaroom are developing a project to articulate a theoretical proposal on new audiovisual production technologies and their workflows in relation to Ultra High Definition. The new international standards for increasing the quality of visual and sound representation will have a strong impact on all subsectors of the audiovisual industry, but will also affect all creative processes and workflows of digital artists, evolving expressive and narrative forms. Getting ahead of these changes is one of the main objectives of this working group.
SMARDS
Scene Segmentation Solutions for Smart Advertising, Advanced Navigation and User Profiling (smArDS). Under this name, a team from Ugiat Technologies and the Universitat Politècnica de Catalunya is automatically analyzing audiovisual content using techniques based on low-level descriptors and machine learning algorithms. With this, they seek to obtain a hierarchical decomposition of the video into scenes, shot changes, phrases... that makes it possible to find the best points for navigation or ad insertion. Among other benefits, this could avoid, in non-linear television environments, the problem of randomly inserted advertising that interrupts the content in the middle of a word or action, in addition to improving the browsing experience with a player that can move back or forward through scenes, shots and phrases in a natural and intuitive way.
Designed as a software product, smArDS analyzes the video and extracts the associated metadata, determining the best points for inserting advertising, the points with the highest ad-completion rate, and even user profiles based on audiovisual metadata.
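A minimal example of the kind of low-level descriptor such a system builds on (the real smArDS pipeline combines several descriptors with learned models, so this is only the simplest building block) is hard-cut detection from colour-histogram differences between consecutive frames:

```python
import numpy as np

def shot_boundaries(frames: np.ndarray, threshold: float = 0.5) -> list:
    """Detect hard cuts from colour-histogram differences between frames.

    frames: uint8 array of shape (N, H, W, 3).
    Returns indices i where a cut is detected between frame i-1 and frame i.
    """
    cuts, prev_hist = [], None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=64, range=(0, 255))
        hist = hist / hist.sum()
        if prev_hist is not None:
            # L1 distance between normalized histograms lies in [0, 2].
            if np.abs(hist - prev_hist).sum() > threshold:
                cuts.append(i)
        prev_hist = hist
    return cuts

# Toy usage: two "shots" of flat colour with one hard cut in the middle.
frames = np.concatenate([
    np.full((10, 36, 64, 3), 30, dtype=np.uint8),
    np.full((10, 36, 64, 3), 200, dtype=np.uint8),
])
print(shot_boundaries(frames))  # -> [10]
```

Cut positions like these then become natural candidates for ad-insertion points or navigation anchors.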
ZENIT-MOCAP
A research group from the Miguel de Cervantes European University is developing Motion Capture (MoCap) technology with applications in film, television and video games. Zenit-MoCap aims to become a tool for creating animations directly by acquiring the movements of a real actor or character. Based on 3D inertial motion sensors combining accelerometers, magnetometers and a digital compass, Zenit-MoCap enables the integration of 3D CGI in real time with the sole help of third-party software (Maya, Unity, MotionBuilder).
The novelty of this project is that it is not limited to two stereoscopic cameras (Kinect style) but rather makes use of inertial systems that collect the acceleration of the joints, even correcting perceived deviations without the need for a complex camera capture system. By not depending on a video camera, blind spots and the loss of information about the captured object are minimized.
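To illustrate the inertial idea on a single axis (a generic complementary filter, not the per-joint sensor fusion Zenit-MoCap actually implements, which also uses magnetometer data), gyroscope rates can be integrated and the accumulated drift corrected with the accelerometer's tilt estimate:

```python
import numpy as np

def complementary_filter(gyro_rate_dps: np.ndarray,
                         accel_angle_deg: np.ndarray,
                         dt: float = 0.01,
                         alpha: float = 0.98) -> np.ndarray:
    """Fuse gyro and accelerometer readings into a drift-corrected angle.

    gyro_rate_dps:   angular-rate samples from the gyroscope (deg/s).
    accel_angle_deg: tilt angle estimated from the accelerometer (deg).
    The gyro is accurate short-term but drifts; the accelerometer is noisy
    but drift-free, so each step blends the two.
    """
    angles = np.zeros(len(gyro_rate_dps))
    angle = accel_angle_deg[0]
    for i in range(len(gyro_rate_dps)):
        gyro_estimate = angle + gyro_rate_dps[i] * dt
        angle = alpha * gyro_estimate + (1.0 - alpha) * accel_angle_deg[i]
        angles[i] = angle
    return angles

# Example: a joint rotating at a steady 10 deg/s for one second.
t = np.arange(0, 1, 0.01)
fused = complementary_filter(np.full_like(t, 10.0), 10.0 * t)
```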
VIYOU
The Universities of Huelva and Almería are studying the use of collaborative video annotations, in which an annotation made by one user can in turn be commented on by another, and so on. The research team aims to establish an online method for collaborative content-based learning. The method includes a collaborative content platform (Viyou) and a hardware system for content access and authentication, offering users a fast and scalable experience from any device. The approach seeks to guarantee collaboration between teachers and students, generating reusable collaborative content that is accessible from any medium connected to the Internet, including smart TVs.
DISCLOSE
Usually the work of researchers does not reach the society that, to a large extent, finances it. For this reason, a project from the University of Vigo strives to facilitate the transfer of information between the field of research and society. They propose the use of attractive audiovisual techniques based on 2D and 3D animation that allow rigorous recreation of the processes and results obtained in an investigation. Currently, the Divulgare project is focused on the dissemination of knowledge in environmental and earth sciences, but it is potentially applicable to other areas of research, incorporating any type of algorithms and processes.
INDEXING AND SEMANTIC SEARCH
The Multimedia Technologies Group and the Information Society Services Group of the University of Vigo have developed a project that seeks to improve the search, cataloging and recommendation of audiovisual content through the combined processing of video, audio and text. To this end, they have launched a series of multimedia and natural language processing modules, integrating them into a demonstrative search and recommendation platform on an audiovisual repository.
The team works from the premise that the vast majority of search engines are largely blind to the enormous amount of information contained in audiovisual files, which is why powerful processing modules are needed to extract and index metadata for search and cataloguing. To do this, they use advanced processing technologies such as automatic transcription, semantic tag extraction, voice and facial movement analysis, subtitle analysis...
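Once transcripts, tags and subtitles have been extracted, making them searchable is conceptually straightforward. A minimal sketch (with invented asset data; the Vigo platform uses far richer NLP and multimedia modules) of indexing such metadata into an inverted index:

```python
from collections import defaultdict

# Invented example assets with metadata of the kind the processing modules
# would extract (transcripts and semantic tags).
assets = {
    "clip_001": {"transcript": "interview about renewable energy policy",
                 "tags": ["interview", "energy"]},
    "clip_002": {"transcript": "football match highlights second half",
                 "tags": ["sports", "football"]},
}

index = defaultdict(set)   # term -> set of asset ids
for asset_id, meta in assets.items():
    for term in meta["transcript"].split() + meta["tags"]:
        index[term.lower()].add(asset_id)

def search(query: str) -> set:
    """Return assets containing every query term in transcript or tags."""
    results = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*results) if results else set()

print(search("energy interview"))  # -> {'clip_001'}
```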
AUGMENTED REALITY IN URBAN ENVIRONMENTS
Developing an augmented reality application and its implementation on a mobile platform for the reconstruction of lost or hidden cultural heritage in urban environments is the objective of a project developed by the University of Zaragoza. The appearance of devices aimed at augmented reality and the growing use of smartphones for this purpose will be accompanied by the need to have new services and applications based on distributed computing between the cloud and the terminal.
Currently, mobile augmented reality applications rely on 2D markers or simple pre-trained geometries, the only problems that the computing power of today's smartphones can handle on their own. In this project, the heaviest processes are moved to the cloud. For now, the University of Zaragoza has developed a pilot in which an application allows you to take a photograph from any position in the Plaza de San Felipe in the Aragonese capital and see that same photograph augmented with a recreation of the vanished Torre Nueva in the location it once occupied.
USABILITY IN INTERACTIVE TV
Consolidating a line of research and development into applications and services for interactive digital television and smart TV, TV 3.0 applications, and OTT and IPTV services, as well as investigating production paradigms focused on the new environment of OTT media and distribution, are the objectives of a research project being carried out by the Polytechnic University of Madrid. The project outlines the problems that can be solved by building content, applications and services on interactive digital television technologies. In particular, it seeks to offer open-source solutions of collective interest, associated with improving the well-being of the population in areas such as education, health and government, with an emphasis on social and digital inclusion. The novelty of this initiative is that it involves the production of interactive learning objects for interactive digital TV in a SCORM-compatible format. Systems, codecs, standards and peripherals will thus be generated in order to evaluate the new capabilities of television and online video repositories.
HYBRID SYNC
The Polytechnic University of Valencia is working on a project on hybrid and inter-destination media synchronization (IDMS) to enable personalized, immersive and shared rich multimedia experiences. Among other aspects, it addresses one of the main challenges that arise when broadcast networks (such as DVB) and broadband (such as the Internet) are used in a coordinated manner to offer broad and ubiquitous access to related multimedia content: the need for signaling and synchronization mechanisms for that content on one or multiple consumer devices. This would enable personalized, immersive, interactive and shared television experiences.
It should be noted that although the HbbTV 2.0 standard provides basic hybrid content synchronization mechanisms, either on a single device or in multi-screen scenarios, this project offers advanced and precise solutions for these functionalities, as well as for the synchronization of content between groups of remote users. In addition, multisensory elements such as aromas are added to traditional audiovisual content, and AV conferencing tools are integrated to improve interactivity.
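The core of inter-destination synchronization can be illustrated with a toy calculation (real deployments rely on the DVB-CSS / HbbTV 2.0 protocols and adaptive playout rather than this sketch): each device reports which media timestamp it was presenting at a given wall-clock time, and every device then delays its playout towards the most lagging receiver:

```python
import time

def playout_offsets(reports: dict) -> dict:
    """reports maps device -> (media_time_s, wall_clock_s).

    Returns, per device, how many seconds it should delay its playout so
    that all devices present the same media time simultaneously.
    """
    now = time.time()
    # Current media position of each device, extrapolated to "now".
    positions = {dev: media + (now - wall)
                 for dev, (media, wall) in reports.items()}
    reference = min(positions.values())   # sync to the most lagging device
    return {dev: pos - reference for dev, pos in positions.items()}

reports = {
    "smart_tv": (120.0, time.time()),
    "tablet":   (121.3, time.time()),   # 1.3 s ahead of the TV
}
print(playout_offsets(reports))  # tablet should hold back by ~1.3 s
```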
PLAYER4ALL
Facilitating the inclusion of all audiovisual accessibility services in a production, so that each of them (audio description, sign language and subtitles) can be activated at will by each user, with size and layout adjusted in a personalized way, is the main objective of the Player4All project, led by Rey Juan Carlos University together with the CNLSE and EDSol Producciones.
The project would improve communication capacity and audiovisual accessibility in TV broadcasts, video on demand and Internet streams. Player4All has already produced a video-on-demand player as the first fruit of its work and is currently working on a second version that will handle live video using the HbbTV standard. Notably, no other current player allows all accessibility services to be customized.
EUMSSI
VSN, in collaboration with Pompeu Fabra University and the German broadcaster Deutsche Welle, together with several European laboratories and research centers, has launched the EUMSSI (Event Understanding through Multimodal Social Stream Interpretation) project. It seeks to give media outlets a platform with direct access to data and multimedia information on the main events taking place in the world, already analyzed, interpreted and catalogued by topic, so that they can work with verified, filtered information of higher quality without wasting time and resources searching for data across numerous online sources.
The technological commitment of the EUMSSI platform is based on a multimodal analytics system that helps organize, classify and group information flows from online and offline media, integrating content from various sources in an interactive way and enriching it with associated metadata.
Some of the substantial differences that distinguish the EUMSSI project from other proposals that incorporate multimodal search are the interoperability and interactivity offered by its data disaggregation system, thanks to the use of cutting-edge information analysis and extraction technologies. The platform is being developed on an open source license.
VOICE
La Salle Campus Barcelona (Ramon Llull University) is immersed in a project that integrates text-to-speech and voice transformation technologies in order to develop a single system for generating personalized synthetic voices through an intuitive user interface. The technology can convert any text message into the type of voice the user indicates (child, robot, man, woman...) through an equalizer-style interface, starting from a single neutral voice in the desired language. In fields such as video game production, Voixter will make it possible to use a synthetic voice during the production phase, reducing costs and deadlines and leaving voice-over work for the final production stage. As the quality of the synthesis improves and allows more natural and expressive voices, it could even replace a professional announcer.
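As a very rough flavour of the "same content, different voice" idea (using an off-the-shelf pitch shift, which is far cruder than the dedicated voice-transformation models a system like Voixter relies on), a neutral signal can be shifted towards a higher, child-like register:

```python
import numpy as np
import librosa

# Stand-in for a neutral synthetic voice: a 120 Hz tone, one second long.
sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
neutral = 0.3 * np.sin(2 * np.pi * 120 * t)

# Shift the pitch up by seven semitones (roughly a fifth) while keeping the
# duration; real voice transformation also reshapes timbre and prosody.
child_like = librosa.effects.pitch_shift(neutral, sr=sr, n_steps=7)
```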